Preface


... that departed from the traditional dry-as-dust mathematics textbook. 
(M. Kline, from the Preface to the paperback edition of Kline 1972) 


Also for this reason, I have taken the trouble to make a great number of 
drawings. (Brieskorn & Knörrer, Plane algebraic curves, p. ii)


... I should like to bring up again for emphasis ... points, in which my 
exposition differs especially from the customary presentation in the text- 
books: 

1. Illustration of abstract considerations by means of figures. 

2. Emphasis upon its relation to neighboring fields, such as calculus of dif- 
ferences and interpolation . .. 

3. Emphasis upon historical growth. 

It seems to me extremely important that precisely the prospective teacher 
should take account of all of these. (F. Klein 1908, Engl. ed. p. 236) 


Traditionally, a rigorous first course in Analysis progresses (more or less) in the 
following order: 


    sets, mappings  ⇒  limits, continuous functions  ⇒  derivatives  ⇒  integration.


On the other hand, the historical development of these subjects occurred in reverse 
order: 
    Cantor 1875, Dedekind  ⇐  Cauchy 1821, Weierstrass  ⇐  Newton 1665, Leibniz 1675  ⇐  Archimedes, Kepler 1615, Fermat 1638


In this book, with the four chapters 


Chapter I. Introduction to Analysis of the Infinite
Chapter II. Differential and Integral Calculus 
Chapter III. Foundations of Classical Analysis 
Chapter IV. Calculus in Several Variables, 


we attempt to restore the historical order, and begin in Chapter I with Cardano,
Descartes, Newton, and Euler’s famous Introductio. Chapter II then presents 17th
and 18th century integral and differential calculus “on period instruments” (as a
musician would say). The creation of mathematical rigor in the 19th century by
Cauchy, Weierstrass, and Peano for one and several variables is the subject of
Chapters III and IV.

This book is the outgrowth of a long period of teaching by the two authors. 
In 1968, the second author lectured on analysis for the first time, at the University 
of Innsbruck, where the first author was a first-year student. Since then, we have 
given these lectures at several universities, in German or in French, influenced by 
many books and many fashions. The present text was finally written up in French 
for our students in Geneva, revised and corrected each year, then translated into 
English, revised again, and corrected with the invaluable help of our colleague 
John Steinig. He has corrected so many errors that we can hardly imagine what 
we would have done without him. 




Numbering: each chapter is divided into sections. Formulas, theorems, fig- 
ures, and exercises are numbered consecutively in each section, and we also in- 
dicate the section number, but not the chapter number. Thus, for example, the 
7th equation to be labeled in Section II.6 is numbered “(6.7)”. References to this 
formula in other chapters are given as “(II.6.7)”. 

References to the bibliography: whenever we write, say, “Euler (1737)” or 
“(Euler 1737)”, we refer to a text of Euler’s published in 1737, detailed references 
to which are in the bibliography at the end of the book. We occasionally give more 
precise indications, as for instance “(Euler 1737, p. 25)”. This is intended to help 
the reader who wishes to look up the original sources and to appreciate the often 
elegant and enthusiastic texts of the pioneers. When there is no corresponding 
entry in the bibliography, we either omit the parentheses or write, for example,
“(in 1580)”.

Quotations: we have included many quotations from the literature. Those ap- 
pearing in the text are usually translated into English; the non-English originals 
can be consulted in the Appendix. They are intended to give the flavor of math- 
ematics as an international science with a long history, sometimes to amuse, and 
also to compensate those readers without easy access to a library with old books. 
When the source of a quotation is not included in the bibliography, its title is indicated
directly, as for example the book by Brieskorn and Knörrer from which we
have quoted above.

Acknowledgments: the text was processed in plain TeX on our Sun workstations
at the University of Geneva using macros from Springer-Verlag New
York. We are grateful for the help of J.M. Naef, “Mr. Sun” of the “Services In- 
formatiques” of our university. The figures are either copies from old books (pho- 
tographed by J.M. Meylan from the Geneva University Library and by A. Perru- 
choud) or have been computed with our Fortran codes and included as Postscript 
files. The final printing was done on the 1200dpi laser printer of the Psychology 
Department in Geneva. We also thank the staff of the mathematics department 
library and many colleagues, in particular R. Bulirsch, P. Deuflhard, Ch. Lubich, 
R. Marz, A. Ostermann, J.-Cl. Pont, and J.M. Sanz-Serna for valuable comments 
and hints. Last but surely not least we want to thank Dr. Ina Lindemann and her 
équipe from Springer-Verlag New York for all her help, competent remarks, and 
the agreeable collaboration. 


March 1995 E. Hairer and G. Wanner. 


Preface to the 2nd, 3rd, and 4th Corrected Printings. These new printings al- 
lowed us to correct several misprints and to improve the text in many places. In 
particular, we give a more geometric exposition of Tartaglia’s solution of the cubic 
equation, improve the treatment of envelopes, and give a more complete proof of 
the transformation formula of multiple integrals. We are grateful to many students 
and colleagues who have helped us to discover errors and possible improvements, 
in particular R.B. Burckel, H. Fischer, J.-L. Gaudin, and H.-M. Maire. We would


Introduction to Analysis of the Infinite


... our students of mathematics would profit much more from a study
of Euler’s Introductio in Analysin Infinitorum, rather than of the available 
modern textbooks. 

(André Weil 1979, quoted by J.D. Blanton 1988, p. xii) 


... Since the teacher was judicious enough to allow his unusual pupil (Jacobi)
to occupy himself with Euler’s Introductio, while the other pupils
made great efforts ....
(Dirichlet 1852, speech in commemoration of Jacobi, in Jacobi’s Werke, vol. I, p. 4)


This chapter explains the origin of elementary functions and the impact of Des- 
cartes’s “Géométrie” on their calculation. The interpolation polynomial leads to 
Newton’s binomial theorem and to the infinite series for exponential, logarith- 
mic, and trigonometric functions. The chapter ends with a discussion of complex 
numbers, infinite products, and continued fractions. The presentation follows the 
historical development of this subject, with the mathematical rigor of the period. 
The justification of dubious conclusions will be an additional motivation for the 
rigorous treatment of convergence in Chapter III. 

Large parts of this chapter — as well as its title — were inspired by Euler’s 
Introductio in Analysin Infinitorum (1748). 




I.1 Cartesian Coordinates and Polynomial Functions 


As long as Algebra and Geometry were separated, their progress was slow 
and their use limited; but once these sciences were united, they lent each 
other mutual support and advanced rapidly together towards perfection. We 
owe to Descartes the application of Algebra to Geometry; this has become 
the key to the greatest discoveries in all fields of mathematics. 
(Lagrange 1795, Oeuvres, vol.7, p.271) 
Greek civilization produced the first great flowering of mathematical talent. Start- 
ing with Euclid’s era (~ 300 B.C.), Alexandria became the world center of sci- 
ence. The city was devastated three times (in 47 B.C. by the Romans, in 392 by 
the Christians, and finally in 640 by the Moslems), and this led to the decline of 
this civilization. Following the improvement of Arabic writing (necessary for the 
Koran), Arab writers eagerly translated the surviving fragments of Greek works 
(Euclid, Aristotle, Plato, Archimedes, Apollonius, Ptolemy), as well as Indian 
arithmeticians, and started new research in mathematics. Finally, during the Crusades
(1100-1300), the Europeans discovered this civilization; Gerard of Cremona
(1114-1187), Robert of Chester (XIIth century), Leonardo da Pisa (“Fibonacci”,
around 1200) and Regiomontanus (1436-1476) were the main translators and the
first scientists of Western Europe.
At that time, mathematics were clearly separated: on one side algebra, on the 
other geometry. 


Algebra 


Diophantus can be considered the inventor of Algebra; ... 
(Lagrange 1795, Oeuvres, vol.7, p. 219) 
Algebra is a heritage from Greek and Oriental antiquity. The famous book Al-jabr
w’al muqabala by Mohammed ben Musa Al-Khowarizmi¹ (A.D. 830) starts by
dealing with the solution of quadratic equations. The oldest known manuscript
dates from 1342 and begins as follows:²


¹ The words “algebra” and “algorithm” originate from Al-jabr and Al-Khowarizmi, respectively.

² This picture as well as Figs. 1.1 and 1.2 are reproduced with permission of the Bodleian
Library, University of Oxford, Ms. Huntington 214, folios 1R, 4R and 4V. English translation:
F. Rosen (1831).

Al-Khowarizmi’s Examples. Consider the quadratic equation

(1.1)    $x^2 + 10x = 39$.


Such an equation hides the unknown solution $x$, which is called by the Arabs dshidr
(root), a word that originally stood for the side of a square of a given surface (“A
root is any quantity which is to be multiplied by itself”, F. Rosen 1831, p. 6).


FIGURE 1.1. Solution of $x^2 + 10x = 39$ (left: manuscript of 1342; right: modern drawing)


Solution. Al-Khowarizmi sketches a square of side $x$ to represent $x^2$ and two
rectangles of sides 5 and $x$ for the term $10x$ (see Fig. 1.1). Equation (1.1) shows
that the shaded region of Fig. 1.1 is 39; consequently, the area of the whole square
is $39 + 25 = 64 = 8 \cdot 8$, thus $5 + x = 8$ and $x = 3$.
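
The geometric argument amounts to completing the square. A minimal Python sketch of it follows; the function name `complete_square_root` is ours, and positive coefficients are assumed, as in Al-Khowarizmi's setting.

```python
def complete_square_root(b, c):
    """Positive root of x^2 + b*x = c, following the drawing of Fig. 1.1."""
    half = b / 2                  # side of the added corner square (5 for b = 10)
    whole_area = c + half ** 2    # shaded region plus corner square (39 + 25)
    side = whole_area ** 0.5      # side of the whole square (8)
    return side - half            # x = 8 - 5

x = complete_square_root(10, 39)
print(x)  # 3.0
```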


FIGURE 1.2. Solution of $x^2 + 21 = 10x$ (left: manuscript of 1342; right: modern drawing)


With a second example (from Al-Khowarizmi),

(1.2)    $x^2 + 21 = 10x$

(or, if you prefer the Latin of Robert of Chester’s translation: “Substancia vero et
21 dragmata 10 rebus equiparantur”), we demonstrate that different signs require
different figures. To obtain its solution we sketch a square for $x^2$ and we attach
a rectangle of width $x$ and of unknown length for the 21 (Fig. 1.2). Because of
(1.2), the total figure has length 10. It is split in the middle and the small rectangle
(A) contained between $x^2$ and the bisecting line is placed on top (B). This gives a
figure of height 5. The gray area is 21 and the complete square (gray and black) is
$5 \cdot 5 = 25$. Consequently, the small black square must be $25 - 21 = 4 = 2 \cdot 2$ and
we obtain $x = 3$. Using a similar drawing (you can have a try), Al-Khowarizmi
also finds the second solution $x = 7$.

Mohammed ben Musa Al-Khowarizmi describes his solution as follows
(Rosen 1831, p. 11):


... for instance, “a square and twenty-one in numbers are equal to ten roots of the same
square.” That is to say, what must be the amount of a square, which, when twenty-one 
dirhems are added to it, becomes equal to the equivalent of ten roots of that square? Solu- 
tion: Halve the number of the roots; the moiety is five. Multiply this by itself; the product 
is twenty-five. Subtract from this the twenty-one which are connected with the square; the 
remainder is four. Extract its root; it is two. Subtract this from the moiety of the roots, 
which is five; the remainder is three. This is the root of the square which you required, and 
the square is nine. Or you may add the root to the moiety of the roots; the sum is seven; 
this is the root of the square which you sought for, and the square itself is forty-nine. 
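
The verbal recipe quoted above translates step by step into code. The following Python sketch is only an illustration; the function name `roots_square_plus_number` and the error message are ours.

```python
def roots_square_plus_number(roots, number):
    """Both solutions of x^2 + number = roots * x, following the quoted recipe."""
    moiety = roots / 2                   # "Halve the number of the roots"
    remainder = moiety ** 2 - number     # square the moiety, subtract the number
    if remainder < 0:
        raise ValueError("no real solution: the number exceeds the squared moiety")
    r = remainder ** 0.5                 # "Extract its root"
    return moiety - r, moiety + r        # subtract from, or add to, the moiety

small, large = roots_square_plus_number(10, 21)
print(small, large)  # 3.0 7.0
```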

As an application, Al-Khowarizmi solves the following puzzle: “I have divided
10 into two parts, and multiplying one of these by the other, the result was
21”. Putting $x$ for one of the two parts and $10 - x$ for the other, and multiplying them,
we obtain

(1.3)    $x \cdot (10 - x) = 21$,

which is equivalent to (1.2). Hence, the solution is given by the two roots of
Eq. (1.2), i.e., 3 and 7 or vice versa.


The Solution for Equations of Degree 3. 


Tartalea presented his solution in bad Italian verse ...
(Lagrange 1795, Oeuvres, vol. 7, p. 22) 


... have discovered the general rule, but for the moment I want to keep it 
secret for several reasons. 
(Tartaglia 1530, see M. Cantor 1891, vol. II, p. 485) 


For example, let us try to solve

(1.4)    $x^3 + 6x = 20$,

or, in “bad” Italian verse, “Quando che’l cubo con le cose appresso, Se agguaglia
a qualche numero discreto ...” (see M. Cantor 1891, vol. II, p. 488). Nicolo
Tartaglia (1499-1557) and Scipione dal Ferro (1465-1526) found independently
the method for solving the problem, but they kept it secret in order to win competitions.
Under pressure, and lured by false promises, Tartaglia divulged it to
Gerolamo Cardano (1501-1576), veiled in verses and without derivation (“suppressa
demonstratione”). Cardano reconstructed the derivation with great difficulty
(“quod difficillimum fuit”) and published it in his “Ars Magna” 1545 (see
also di Pasquale 1957, and Struik 1969, p. 63-67).

Derivation. We represent $x^3$ by a cube with edges of length $x$ (what else?, gray in
Fig. 1.3a); the term $6x$ is attached in the form of three square prisms of volume $x^2 v$
and three of volume $x v^2$ (white in Fig. 1.3a). We obtain a body of volume 20 (by
(1.4)) which is the difference of a cube $u^3$ and a cube $v^3$ (see Fig. 1.3a), i.e.,

    $u^3 - v^3 = 20$,

FIGURE 1.3a. Cubic equation (1.4)        FIGURE 1.3b. Justification of (1.6)

FIGURE 1.3c. Extract from Cardano, Ars Magna 1545, ed. Basilea 1570³


where

(1.5)    $u = x + v$.

Arranging the six new prisms as in Fig. 1.3b, we see that their volume is equal to
$6x$ (what is required) if

(1.6)    $3uvx = 6x$    or    $uv = 2$.

We now know the sum (= 20) and the product (= -8) of $u^3$ and $-v^3$ and can
thus reconstruct these two numbers, as in Al-Khowarizmi’s puzzle (1.3), as

    $u^3 = 10 + \sqrt{108}$,    $-v^3 = 10 - \sqrt{108}$.

Taking then cube roots and using $x = u - v$ we obtain (see the facsimile in
Fig. 1.3c)

(1.7)    $x = \sqrt[3]{\sqrt{108} + 10} - \sqrt[3]{\sqrt{108} - 10}$.
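
As a check that is not part of the historical derivation, the nested radicals in (1.7) can be simplified by hand, using $\sqrt{108} = 6\sqrt{3}$:

```latex
\begin{align*}
(\sqrt{3} + 1)^3 &= 3\sqrt{3} + 9 + 3\sqrt{3} + 1 = 6\sqrt{3} + 10 = \sqrt{108} + 10, \\
(\sqrt{3} - 1)^3 &= 3\sqrt{3} - 9 + 3\sqrt{3} - 1 = 6\sqrt{3} - 10 = \sqrt{108} - 10, \\
x &= (\sqrt{3} + 1) - (\sqrt{3} - 1) = 2 .
\end{align*}
```

Indeed, $2^3 + 6 \cdot 2 = 20$, in agreement with (1.4).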


³ Reproduced with permission of Bibl. Publ. Univ. Genève.

Some years later a method of solving equations of degree 4 was found (Ludovico
Ferrari, see Struik 1969, p. 69f, and Exercises 1.1 and 1.2); the equation of
degree 5 remained a mystery for centuries, until Abel’s proof about the impossibility
of solutions by radicals in 1826.


“Algebra Nova” 


The Numerical Logistic is the one displayed and treated by numbers; the 
Specific is displayed by kinds or forms of things: as by the letters of the 
Alphabet. (Viète 1600, Algebra nova, French edition 1630)


ALGEBRA is a general Method of Computation by certain Signs and Sym- 
bols which have been contrived for this Purpose, and found convenient. 
(Maclaurin 1748, A Treatise of Algebra, p. 1) 
The ancient texts dealt only with particular examples and their authors carried
out “arithmetical” calculations using only numbers. François Viète (= Franciscus
Vieta 1540-1603, 1591 In artem analyticam isagoge, 1600 Algebra nova) had the
fundamental idea of writing letters A, B, C, X, ... for the known and unknown
quantities of a problem (often geometric) and of using these letters for algebraic
calculations (see the facsimile in Fig. 1.4a). Since no problem of the Greek era
appeared to resist the method

    Geometrical Problem  --(put letters)-->  Algebraic Problem  --(calculations)-->  Solution,

Viète wrote in capital letters “NVLLVM NON PROBLEMA SOLVERE” (i.e.,
“GIVING SOLVTION TO ANY PROBLEM”). The perfection of this idea led to
Descartes’s “Geometry”.


Exemple.

Qu’il faille adjouster A + D, auec B + 2D, la
somme sera A + B + 3D, obseruant ce qui a esté
dit.

        B + 2D.
        A + D.

        A + B + 3D.


FIGURE 1.4a. Facsimile of the French edition (1630) of Viète (1600)⁴


⁴ Reproduced with permission of Bibl. Publ. Univ. Genève.

FIGURE 1.4b. Extracts of Viète (1591a)⁵ (Opera p. 129 and 150); solution of $A^2 + 2BA = Z$
and $A^3 - 3BA = 2Z$


Example. (Trisection of an angle). The famous classical
problem “Datum angulum in tres partes equales
secare” becomes, with the help of

(1.8)    $\sin(3\alpha) = 3 \sin\alpha \cos^2\alpha - \sin^3\alpha$

(see (4.14) below) and of some simple calculations, the
algebraic equation

(1.9)    $-4X^3 + 3X = B$

(see Viète 1593, Opera, p. 290). Its solution is obtained from (1.14) below.
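
The “simple calculations” presumably consist of replacing $\cos^2\alpha$ by $1 - \sin^2\alpha$ in (1.8), which gives $\sin(3\alpha) = 3\sin\alpha - 4\sin^3\alpha$. The short Python check below assumes the identification $X = \sin\alpha$, $B = \sin(3\alpha)$; this is our reading, and the function name is hypothetical.

```python
from math import sin, radians

def trisection_residual(alpha):
    """Left side minus right side of (1.9), assuming X = sin(alpha), B = sin(3*alpha)."""
    X = sin(alpha)
    B = sin(3 * alpha)
    return -4 * X ** 3 + 3 * X - B

# The residual should vanish up to rounding for any angle.
for degrees in (10, 20, 35, 60):
    print(degrees, abs(trisection_residual(radians(degrees))) < 1e-12)
```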


Formula for the Equation of Degree 2. In Viète’s notation, the complicated text
by Al-Khowarizmi (see above) becomes the “formula”

(1.10)    $x^2 + ax + b = 0 \;\Longrightarrow\; x_1, x_2 = -a/2 \pm \sqrt{a^2/4 - b}$.


Formula for the Equation of Degree 3.

(1.11)    $y^3 + ay^2 + by + c = 0 \;\xrightarrow{\; y + a/3 = x \;}\; x^3 + px + q = 0$.


We set $x = u + v$ (this corresponds to (1.5) with “$-v$” replaced by “$v$”), so that
Eq. (1.11) becomes

(1.12)    $u^3 + v^3 + (3uv + p)(u + v) + q = 0$.

Putting $uv = -p/3$ (this corresponds to (1.6)), we obtain

(1.13)    $u^3 + v^3 = -q$,    $u^3 v^3 = -p^3/27$.

By Al-Khowarizmi’s puzzle (1.3) and formula (1.10), we get (see the facsimile
in Fig. 1.4b)

(1.14)    $x = \sqrt[3]{-q/2 + \sqrt{q^2/4 + p^3/27}} + \sqrt[3]{-q/2 - \sqrt{q^2/4 + p^3/27}}$.
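
Formula (1.14) can be tried out numerically. The sketch below is a minimal Python illustration, not a general cubic solver: it assumes $q^2/4 + p^3/27 \ge 0$, so that only real square and cube roots occur, and the helper names `real_cbrt` and `cardano_root` are ours.

```python
from math import sqrt, copysign

def real_cbrt(t):
    """Real cube root, also for negative t."""
    return copysign(abs(t) ** (1.0 / 3.0), t)

def cardano_root(p, q):
    """One real root of x^3 + p*x + q = 0 via (1.14); needs q^2/4 + p^3/27 >= 0."""
    disc = q * q / 4 + p ** 3 / 27
    if disc < 0:
        raise ValueError("q^2/4 + p^3/27 < 0: (1.14) would need complex numbers")
    s = sqrt(disc)
    return real_cbrt(-q / 2 + s) + real_cbrt(-q / 2 - s)

x = cardano_root(6, -20)    # equation (1.4): x^3 + 6x = 20
print(round(x, 12))         # 2.0
```

For $p = 6$, $q = -20$ the expression under the square root is 108, so the function evaluates exactly the radicals of (1.7).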


⁵ Reproduced with permission of Bibl. Publ. Univ. Genève. Here, the unknown variable is
A. Only with Descartes did the choice of x, y, z for unknowns come into use.

Descartes’s Geometry


Here I beg you to observe in passing that the scruples that prevented ancient
writers from using arithmetical terms in geometry, and which can only be a
consequence of their inability to perceive clearly the relation between these
two subjects, introduced much obscurity and confusion into their explanations.
(Descartes 1637)
Geometry, the gigantic heritage of Greek antiquity, was brought to Europe thanks 
to the Arabic translations. 

For example, Euclid’s Elements (around 300 B.C.) consist of 13 “Books” 
containing “Definitions”, “Postulates”, in all 465 “Propositions”, that are rigor- 
ously proved. The Conics by Apollonius (200 B.C.) are of equal importance. 

Nevertheless, different unsolved problems eluded the efforts of these scien- 
tists: trisection of the angle, quadrature of the circle, and the problem mentioned 
by Pappus (in the year 350), which inspired Descartes’s research. 


Problem by Pappus. (“The question, then, the solution of which was begun by
Euclid and carried farther by Apollonius, but was completed by no one, is this”):
Let three straight lines $a, b, c$ and three angles $\alpha, \beta, \gamma$ be given. For a point $C$,
arbitrarily chosen, let $B, D, F$ be points on $a, b, c$ such that $CB, CD, CF$ form with
$a, b, c$ the angles $\alpha, \beta, \gamma$, respectively (see Figs. 1.5a and 1.5b). We wish to find
the locus of points $C$ for which

(1.15)    $CB \cdot CD = (CF)^2$.

Descartes solved this problem using Viète’s “new” and prestigious algebra; the
point $C$ is determined by the distances $AB$ and $BC$. These two “unknown values”
are denoted by the letters “$x$” and “$y$” (“Que le segment de la ligne AB, qui est
entre les points A & B, soit nommé x. & que BC soit nommé y”.)

For the moment, consider only two of these straight lines (Fig. 1.5c) (“& pour
me demesler de la confusion de toutes ces lignes ...”). We draw the parallel to $EF$
passing through $C$. All angles being given, we see that there are constants $K_1$ and
$K_2$ such that

    $u = K_1 \cdot CF$,    $v = K_2 \cdot y$.

As $AE = x + u + v = K_3$, we get

(1.16)    $CF = d + \ell x + ky$,    $d, \ell, k$ constants.

Similarly,

(1.17)    $CD = mx + ny$,    $m, n$ constants.

(“And thus you see that, ... the length of any such line ... can always be expressed
by three terms, one of which consists of the unknown quantity y multiplied or
divided by some known quantity; another consisting of the unknown quantity x
multiplied or divided by some other known quantity; and the third consisting of a
known quantity. An exception must be made in the case where the given lines are
parallel ...” Descartes 1637, p. 312, transl. D.E. Smith and M.L. Latham 1925).
Thus the condition (1.15) becomes

    $y \cdot (mx + ny) = (d + \ell x + ky)^2$,

which is an equation of the form

(1.18)    $Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0$.

FIGURE 1.5b. Problem by Pappus        FIGURE 1.5c. Equation of a straight line


For each arbitrary y, (1.18) becomes a quadratic equation that is solved by alge- 
bra (see (1.10)). Coordinate transformations show that (1.18) always represents a 
conic. 
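
To make the passage from the Pappus condition to (1.18) concrete, the following Python sketch expands the condition into the six coefficients and solves for $x$ at a fixed $y$ with formula (1.10). The constants used in the example are arbitrary illustrative values, not taken from Descartes, and the function names are ours.

```python
def conic_coefficients(d, l, k, m, n):
    """Coefficients (A, B, C, D, E, F) of (1.18), obtained by expanding
    (d + l*x + k*y)^2 - y*(m*x + n*y) = 0."""
    return (l * l, 2 * l * k - m, k * k - n, 2 * d * l, 2 * d * k, d * d)

def solve_for_x(coeffs, y):
    """Real solutions x of (1.18) for a fixed y, using formula (1.10); needs A != 0."""
    A, B, C, D, E, F = coeffs
    a = (B * y + D) / A                  # normalise to x^2 + a*x + b = 0
    b = (C * y * y + E * y + F) / A
    disc = a * a / 4 - b
    if disc < 0:
        return []
    return [-a / 2 + disc ** 0.5, -a / 2 - disc ** 0.5]

coeffs = conic_coefficients(d=1, l=1, k=0, m=4, n=0)   # hypothetical constants
print(coeffs)                 # (1, -4, 0, 2, 0, 1)
print(solve_for_x(coeffs, 2))
```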


⁶ Fig. 1.5a is reproduced with permission of Bibl. Publ. Univ. Genève.

Polynomial Functions


Algebra not only helps geometry, but geometry also helps algebra, because the
Cartesian coordinates show algebra in a new light. In fact, if instead of (1.1) and
(1.4) we consider

(1.19)    $y = x^2 + 10x - 39$,    $y = x^3 + 6x - 20$


and if we attribute arbitrary values to x, then for each x we can compute a value 
for y and can study the curves obtained in this way (Fig. 1.6). The roots of (1.1) or 
(1.4) appear as the points of intersection of these curves with the x-axis (horizontal 
axis). For example, we discover that the solution of (1.4) is simply x = 2 (a bit 
nicer than Eq. (1.7)). 
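
The idea of attributing values to $x$ and computing $y$ can be imitated in a few lines of Python. The sketch only searches integer abscissas in a small range, which is our choice for illustration.

```python
def p2(x):
    return x ** 2 + 10 * x - 39

def p3(x):
    return x ** 3 + 6 * x - 20

# Integer points where the curves of (1.19) meet the x-axis.
roots_p2 = [x for x in range(-20, 21) if p2(x) == 0]
roots_p3 = [x for x in range(-20, 21) if p3(x) == 0]
print(roots_p2)  # [-13, 3]
print(roots_p3)  # [2]
```

The negative root $-13$ of the quadratic has no counterpart in the geometric solution of Fig. 1.1, where $x$ is the side of a square.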


FIGURE 1.6. Polynomials $x^2 + 10x - 39$ and $x^3 + 6x - 20$


(1.1) Definition. A polynomial is an expression of the form

    $y = a_n x^n + a_{n-1} x^{n-1} + \ldots + a_0$,

where $a_0, a_1, \ldots, a_n$ are arbitrary constants. If $a_n \neq 0$, the polynomial is of
degree $n$.


Interpolation Problem. Given $n + 1$ points $x_i, y_i$ (see Fig. 1.7), we look for a
polynomial of degree $n$ passing through all these points. We are mainly interested
in the situation where the $x_i$ are equidistant, and in particular where

    $x_0 = 0$,  $x_1 = 1$,  $x_2 = 2$,  $x_3 = 3$,  ...

The solution of this problem, which was very useful in the computation of logarithms
and maritime navigation, emerged in the early 17th century from the work
of Briggs and Sir Thomas Harriot (see Goldstine 1977, p. 23f). Newton (1676)
attacked the problem in the spirit of Viète’s “algebra nova” (see Fig. 1.8): write
letters for the unknown coefficients of our polynomial, e.g.,

(1.20)    $y = A + Bx + Cx^2 + Dx^3$.

FIGURE 1.7. Interpolation polynomial

FIGURE 1.8. Problem of interpolation by Newton (1676, Methodus Differentialis)⁷


The values $y_0, y_1, y_2, y_3$ having been given, we transform the “problem” into “algebraic
equations”

    Abscissae    Ordinatae
    $x = 0$      $A = y_0$
    $x = 1$      $A + B + C + D = y_1$
    $x = 2$      $A + 2B + 4C + 8D = y_2$
    $x = 3$      $A + 3B + 9C + 27D = y_3$


⁷ Reproduced with permission of Bibl. Publ. Univ. Genève.

Here, we notice that the value $A$ disappears if we subtract the equations, the 1st
from the 2nd, the 2nd from the 3rd, the 3rd from the 4th:

          $B + C + D = y_1 - y_0 =: \Delta y_0$
(1.21)    $B + 3C + 7D = y_2 - y_1 =: \Delta y_1$
          $B + 5C + 19D = y_3 - y_2 =: \Delta y_2$.

$B$ disappears if we subtract once again:

(1.22)    $2C + 6D = \Delta y_1 - \Delta y_0 =: \Delta^2 y_0$
          $2C + 12D = \Delta y_2 - \Delta y_1 =: \Delta^2 y_1$,

and then so does $C$:

(1.23)    $6D = \Delta^2 y_1 - \Delta^2 y_0 =: \Delta^3 y_0$.


This gives us $D$. Then the first equation of (1.22) yields $C$, the first of (1.21) the
value $B$. We arrive at the solution

(1.24)    $y = y_0 + \Delta y_0 \cdot x + \frac{\Delta^2 y_0}{2} \cdot (x^2 - x) + \frac{\Delta^3 y_0}{6} \cdot (x^3 - 3x^2 + 2x)$,

which can also be written as

(1.24')   $y = y_0 + \frac{x}{1} \Delta y_0 + \frac{x(x-1)}{1 \cdot 2} \Delta^2 y_0 + \frac{x(x-1)(x-2)}{1 \cdot 2 \cdot 3} \Delta^3 y_0$.

We will see in the next paragraph, using Pascal’s triangle, that this is a particular
case of a general formula for polynomials of any degree.


(1.2) Theorem. The polynomial of degree $n$ taking the values

    $y_0$ (for $x = 0$),  $y_1$ (for $x = 1$),  ...,  $y_n$ (for $x = n$)

is given by the formula

    $y = y_0 + \frac{x}{1} \Delta y_0 + \frac{x(x-1)}{1 \cdot 2} \Delta^2 y_0 + \ldots + \frac{x(x-1) \cdots (x-n+1)}{1 \cdot 2 \cdots n} \Delta^n y_0$.
1-2 Ho 


(1.3) Remark. Since Newton (see Fig. 1.9), it is usual to arrange the differences in
the scheme

          $y_0$
                 $\Delta y_0$
          $y_1$            $\Delta^2 y_0$
                 $\Delta y_1$            $\Delta^3 y_0$
(1.25)    $y_2$            $\Delta^2 y_1$            $\Delta^4 y_0$
                 $\Delta y_2$            $\Delta^3 y_1$
          $y_3$            $\Delta^2 y_2$
                 $\Delta y_3$
          $y_4$

where

    $\Delta y_i = y_{i+1} - y_i$,    $\Delta^2 y_i = \Delta y_{i+1} - \Delta y_i$,    $\Delta^3 y_i = \Delta^2 y_{i+1} - \Delta^2 y_i$,    etc.

FIGURE 1.9. Newton’s scheme of differences (Newton 1676, Methodus Differentialis)⁸


Example. For the values of our problem (Fig. 1.7), we obtain

[difference scheme for the data of Fig. 1.7 and the resulting interpolation polynomial]


Other Examples. a) We consider the polynomial $y = x^3$ for which we already
know the solution. The scheme of differences yields

    $x = 0$:    0
                      1
    $x = 1$:    1           6
                      7           6
    $x = 2$:    8          12
                     19
    $x = 3$:   27

    $y = 0 + 1 \cdot x + 6 \cdot \frac{x(x-1)}{2} + 6 \cdot \frac{x(x-1)(x-2)}{6} = x + 3x^2 - 3x + x^3 - 3x^2 + 2x = x^3$.


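b) Here, the values for $x = n$ are the sums $1^3 + 2^3 + \ldots + n^3$,

    $x = 0$:    0
                                        $1^3$
    $x = 1$:    $1^3$                               7
                                        $2^3$            12
    $x = 2$:    $1^3 + 2^3$                         19            6
                                        $3^3$            18            0
    $x = 3$:    $1^3 + 2^3 + 3^3$                   37            6
                                        $4^3$            24
    $x = 4$:    $1^3 + 2^3 + 3^3 + 4^3$             61
                                        $5^3$
    $x = 5$:    $1^3 + 2^3 + 3^3 + 4^3 + 5^3$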

⁸ Reproduced with permission of Bibl. Publ. Univ. Genève.

and we obtain the formula

    $y = x + 7 \cdot \frac{x(x-1)}{2} + 12 \cdot \frac{x(x-1)(x-2)}{6} + 6 \cdot \frac{x(x-1)(x-2)(x-3)}{24} = \frac{x^4}{4} + \frac{x^3}{2} + \frac{x^2}{4}$.
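Similarly, we obtain

          $1 + 2 + \ldots + n = \frac{n^2}{2} + \frac{n}{2}$
          $1^2 + 2^2 + \ldots + n^2 = \frac{n^3}{3} + \frac{n^2}{2} + \frac{n}{6}$
(1.26)    $1^3 + 2^3 + \ldots + n^3 = \frac{n^4}{4} + \frac{n^3}{2} + \frac{n^2}{4}$
          $1^4 + 2^4 + \ldots + n^4 = \frac{n^5}{5} + \frac{n^4}{2} + \frac{n^3}{3} - \frac{n}{30}$
          $1^5 + 2^5 + \ldots + n^5 = \frac{n^6}{6} + \frac{n^5}{2} + \frac{5n^4}{12} - \frac{n^2}{12}$.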
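Jacob Bernoulli (1705) found the general formula

    $1^q + 2^q + \ldots + n^q = \frac{n^{q+1}}{q+1} + \frac{n^q}{2} + \frac{q}{2} A n^{q-1} + \frac{q(q-1)(q-2)}{2 \cdot 3 \cdot 4} B n^{q-3}$
                              $+ \frac{q(q-1)(q-2)(q-3)(q-4)}{2 \cdot 3 \cdot 4 \cdot 5 \cdot 6} C n^{q-5} + \ldots$,

where

(1.27)    $A = \frac{1}{6}$,  $B = -\frac{1}{30}$,  $C = \frac{1}{42}$,  $D = -\frac{1}{30}$,  $E = \frac{5}{66}$,  $F = -\frac{691}{2730}$,  ...

are the so-called Bernoulli numbers. For an elegant explanation see Sect. II.10
below.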
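
Bernoulli's formula can be compared with direct summation. The Python sketch below uses only the six numbers listed in (1.27) and keeps the terms with a positive power of $n$, so it is valid for $1 \le q \le 13$ only; the function name and this cut-off are ours.

```python
from fractions import Fraction
from math import factorial

# A, B, C, D, E, F of (1.27)
BERNOULLI = [Fraction(1, 6), Fraction(-1, 30), Fraction(1, 42),
             Fraction(-1, 30), Fraction(5, 66), Fraction(-691, 2730)]

def power_sum_bernoulli(q, n):
    """1^q + 2^q + ... + n^q by Bernoulli's formula (sketch, 1 <= q <= 13)."""
    total = Fraction(n ** (q + 1), q + 1) + Fraction(n ** q, 2)
    for k, b in enumerate(BERNOULLI, start=1):
        exponent = q - 2 * k + 1          # power of n in the k-th correction term
        if exponent < 1:
            break
        coeff = Fraction(1)
        for j in range(2 * k - 1):        # q(q-1)...(q-2k+2)
            coeff *= q - j
        coeff /= factorial(2 * k)         # divided by 2*3*...*(2k)
        total += coeff * b * n ** exponent
    return total

print(power_sum_bernoulli(4, 10))  # 25333
```

For $q = 1, \ldots, 5$ the function reproduces the formulas (1.26).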


Exercises 
1.1 The following problem, in Viète’s notation,

        $x + y + z = 20$
        $x : y = y : z$
        $xy = 8$

was proposed on the 15th of December 1536 by Zuanne de Tonini da Coi (Colla)
to Tartaglia, who could not solve it (see Notari 1924). Eliminate the variables
$x$ and $z$ and understand why. Cardano later handed the problem over to Ferrari,
who found the solution (see next Exercise). It is not astonishing that later
Ferrari and Tartaglia exchanged ugly letters with heated disputes on mathematical
questions.


1.2 Reconstruct Ferrari’s solution of the biquadratic equation

(1.28)    $x^4 + ax^2 = bx + c$.

Hint. a) Add $a^2/4$ on both sides to obtain

    $\left(x^2 + \frac{a}{2}\right)^2 = bx + c + \frac{a^2}{4}$.

b) Take $y$ as a parameter and add $y^2 + ay + 2x^2 y$ on both sides to obtain

    $\left(x^2 + \frac{a}{2} + y\right)^2 = 2x^2 y + bx + y^2 + ay + c + \frac{a^2}{4}$.

c) The expression to the right, when written as $Ax^2 + Bx + C$, is of the form
$(\alpha x + \beta)^2$ if $B^2 = 4AC$. This leads to a third order equation for $y$.
d) Having found a $y$ satisfying this with Cardano’s formula (1.14), you obtain

    $\left(x^2 + \frac{a}{2} + y\right) = \pm(\alpha x + \beta)$

with two roots each.
Remark. Every polynomial equation $z^4 + az^3 + bz^2 + cz + d = 0$ can be reduced to
the form (1.28) by the transformation $x = z + a/4$.


1.3 (Euler 1749, Opera Omnia, vol. VI, p. 78-147). Solve the equation of degree 4

    $x^4 + Bx^2 + Cx + D = 0$

by comparing the coefficients in

    $x^4 + Bx^2 + Cx + D = (x^2 + ux + \alpha)(x^2 - ux + \beta)$

and finding an equation of degree 3 for $u^2$. Solve this equation and compute
the solutions of two quadratic equations.


1.4 (L. Euler 1770, Vollst. Anleitung zur Algebra, St. Petersburg, Opera Omnia,
vol. I). Consider an equation of degree 4 with symmetric coefficients, e.g.,

(1.29)    $x^4 + 5x^3 + 8x^2 + 5x + 1 = 0$.

Decompose the polynomial as $(x^2 + rx + 1)(x^2 + sx + 1)$ and find the four
solutions of (1.29).

Remark. Another possibility for the solution of (1.29) is to divide the equation
by $x^2$ and to use the new variable $u = x + x^{-1}$.


1.5 Problem proposed by Armenia/Australia for the 35th international mathematical
olympiad (held in Hong Kong, July 12-19, 1994). $ABC$ is an isosceles
triangle with $AB = AC$. Suppose that (i) $M$ is the midpoint of $BC$ and $O$ is
the point on the line $AM$ such that $OB$ is perpendicular to $AB$; (ii) $Q$ is an
arbitrary point on the segment $BC$ different from $B$ and $C$; and (iii) $E$ lies on
the line $AB$ and $F$ lies on the line $AC$ such that $E$, $Q$, and $F$ are distinct and
collinear. Prove, with Viète’s method, that $OQ$ is perpendicular to $EF$ if and only
if $QE = QF$.


R. Descartes 1596-1650⁹        I. Newton 1642-1727⁹

Jac. Bernoulli, Ars conj. 1705⁹


⁹ These figures are reproduced with permission of Bibl. Publ. Univ. Genève.

I.2 Exponentials and the Binomial Theorem


Here it will be proper to observe, that I make use of $x^{-1}$, $x^{-2}$, $x^{-3}$, $x^{-4}$,
&c. for $\frac{1}{x}$, $\frac{1}{x^2}$, $\frac{1}{x^3}$, $\frac{1}{x^4}$, &c. of $x^{1/2}$, $x^{3/2}$, $x^{5/2}$, $x^{1/3}$, $x^{2/3}$, &c. for $\sqrt{x}$, $\sqrt{x^3}$, $\sqrt{x^5}$,
$\sqrt[3]{x}$, $\sqrt[3]{x^2}$, &c. and of $x^{-1/2}$, $x^{-2/3}$, $x^{-1/4}$, &c. for $\frac{1}{\sqrt{x}}$, $\frac{1}{\sqrt[3]{x^2}}$, $\frac{1}{\sqrt[4]{x}}$, &c. And
this by rule of Analogy, as may be apprehended from such Geometrical
Progressions as these; $x^3$, $x^{5/2}$, $x^2$, $x^{3/2}$, $x$, $x^{1/2}$, $x^0$ (or 1), $x^{-1/2}$, $x^{-1}$, $x^{-3/2}$,
$x^{-2}$, &c. (Newton 1671, Fluxiones, Engl. pub. 1736, p. 3)


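For a given number a, we write


(2.1)    $a \cdot a = a^2, \quad a \cdot a \cdot a = a^3, \quad a \cdot a \cdot a \cdot a = a^4, \quad \ldots$


This notation emerged slowly, mainly through the work of Bombelli in 1572,
Simon Stevin in 1585, Descartes, and Newton (see quotation). If we multiply, e.g.,


    $a^2 \cdot a^3 = (a \cdot a) \cdot (a \cdot a \cdot a) = a \cdot a \cdot a \cdot a \cdot a = a^5,$


we see the rule
(2.2)    $a^n \cdot a^m = a^{n+m}.$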


In the geometric progression (2.1), every term is equal to its predecessor multiplied
by a. We can also continue this sequence to the left by dividing the terms by
a. This leads to


    $a^{-2} = \frac{1}{a \cdot a}, \quad a^{-1} = \frac{1}{a}, \quad a^0 = 1,$


where we have used the notation
(2.3)    $a^{-m} = \frac{1}{a^m}.$


In this way, formula (2.2) remains valid also for negative exponents. Next,
multiplying 1 repeatedly by $\sqrt{a}$ (where a has to be a positive number), we obtain a
geometric progression


    $1, \quad \sqrt{a}, \quad \sqrt{a} \cdot \sqrt{a} = a, \quad \sqrt{a} \cdot \sqrt{a} \cdot \sqrt{a} = \sqrt{a^3}, \quad \sqrt{a^4} = a^2, \quad \ldots,$

which suggests the notation
(2.4)    $a^{m/n} = \sqrt[n]{a^m}.$


Now formula (2.2) remains valid for rational exponents. We take only the positive
roots, so that $a^{5/2}$ lies between $a^2$ and $a^3$. The last step (for mankind) is irrational
exponents, which are, as Euler says, “more difficult to understand”. But “Sic $a^{\sqrt7}$
erit valor determinatus intra limites $a^2$ et $a^3$ comprehensus” tells us that $a^{\sqrt7}$ is
a value between $a^2$ and $a^3$, between $a^{26/10}$ and $a^{27/10}$, between $a^{264/100}$ and
$a^{265/100}$, between $a^{2645/1000}$ and $a^{2646/1000}$, and so on.
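
The nested bounds for $a^{\sqrt7}$ can be reproduced numerically. The following is a minimal sketch, not part of the original text; the function name `bracket_exponents` and the choice a = 2 are illustrative assumptions. It finds, for each power of ten q, the integer p with p/q < √7 < (p+1)/q using exact integer arithmetic.

```python
from math import isqrt

def bracket_exponents(digits):
    """Return (p, p + 1, q) with q = 10**digits and p/q < sqrt(7) < (p + 1)/q."""
    q = 10 ** digits
    p = isqrt(7 * q * q)  # largest integer p with p**2 <= 7 * q**2
    return p, p + 1, q

a = 2.0  # any base a > 1 keeps the bounds in increasing order
for d in range(4):
    p_lo, p_hi, q = bracket_exponents(d)
    # a**(p/q) is the positive q-th root of a**p, as in (2.4)
    print(f"a^({p_lo}/{q}) = {a ** (p_lo / q):.6f} < a^sqrt(7) < a^({p_hi}/{q}) = {a ** (p_hi / q):.6f}")
```

Each additional decimal digit of √7 narrows the interval that contains $a^{\sqrt7}$.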
Binomial Theorem


Although this proposition has an infinite number of cases, I shall give quite
a short proof of it, by assuming 2 lemmas.
The 1st, which is self-evident, is that this proportion occurs in the second
base; for it is quite obvious that φ is to σ as 1 is to 1.
The 2nd is that if this proportion occurs in some base, it will necessarily be
true in the next base.

(Pascal 1654, one of the first proofs by induction)


We wish to expand the expression $(a + b)^n$. Multiplying each result in turn by
(a + b) we obtain, successively,


         $(a + b)^0 = 1$
         $(a + b)^1 = a + b$
(2.5)    $(a + b)^2 = a^2 + 2ab + b^2$
         $(a + b)^3 = a^3 + 3a^2b + 3ab^2 + b^3$
         $(a + b)^4 = a^4 + 4a^3b + 6a^2b^2 + 4ab^3 + b^4,$


and so on. There appears an interesting triangle of “binomial coefficients” (Omar 
Alkhaijama in 1080, Tshu shi Kih in 1303, M. Stifel 1544, Cardano 1545, Pascal 
1654, see Fig. 2.1) 


                          1
                        1   1
                      1   2   1
                    1   3   3   1
(2.6)             1   4   6   4   1
                1   5  10  10   5   1
              1   6  15  20  15   6   1
            1   7  21  35  35  21   7   1


in which each number is the sum of its two “superiors”. We want to find a general
law for these coefficients. It is not difficult to see that the first diagonal in this
triangle is composed of “1” and the second (1, 2, 3, ...) of “n”. For the third
diagonal (1, 3, 6, 10, ...) we guess “$\frac{n(n-1)}{1 \cdot 2}$” followed by “$\frac{n(n-1)(n-2)}{1 \cdot 2 \cdot 3}$” and
so on. This suggests the following theorem.
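
The guesses for the third and fourth diagonals can be checked directly. The sketch below is an illustration only (the helper name `pascal_rows` is an assumption); it builds the triangle solely from the rule that each number is the sum of its two superiors and then compares the diagonals with the guessed formulas.

```python
def pascal_rows(n_max):
    """Rows 0..n_max of Pascal's triangle, built only by adding the two 'superiors'."""
    rows = [[1]]
    for _ in range(n_max):
        prev = rows[-1]
        inner = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
        rows.append([1] + inner + [1])
    return rows

rows = pascal_rows(7)
for n, row in enumerate(rows):
    print(n, row)
    if n >= 2:
        assert row[2] == n * (n - 1) // 2            # third diagonal
    if n >= 3:
        assert row[3] == n * (n - 1) * (n - 2) // 6  # fourth diagonal
```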


(2.1) Theorem (Pascal 1654). For n = 0, 1, 2, ... we have


    $(a + b)^n = a^n + \frac{n}{1} a^{n-1} b + \frac{n(n-1)}{1 \cdot 2} a^{n-2} b^2 + \frac{n(n-1)(n-2)}{1 \cdot 2 \cdot 3} a^{n-3} b^3 + \ldots$


This sum is finite and stops after n + 1 terms.


Proof. We compute the ratio of each number in (2.6) with its left-hand neighbor
(Pascal 1654, p. 7, “Consequence douziesme”).


FIGURE 2.1. Original publication of Pascal’s triangle, Pascal (1654)¹


                              1/1
                            2/1  1/2
                          3/1  2/2  1/3
(2.7)                   4/1  3/2  2/3  1/4
                      5/1  4/2  3/3  2/4  1/5
                    6/1  5/2  4/3  3/4  2/5  1/6
                  7/1  6/2  5/3  4/4  3/5  2/6  1/7


Here, it is not difficult to guess a general law. We prove this law “by induction on 
the row-number” (see quotation). Suppose that 


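(2.8)        A     B     C
                D     E              D = A + B,   E = B + C


is a part of Pascal’s triangle with the “induction hypothesis”


    $\frac{B}{A} = \frac{k}{\ell - 1}, \qquad \frac{C}{B} = \frac{k - 1}{\ell}.$

Then,

(2.9)    $\frac{E}{D} = \frac{B + C}{A + B} = \frac{1 + \frac{C}{B}}{\frac{A}{B} + 1} = \frac{1 + \frac{k-1}{\ell}}{\frac{\ell - 1}{k} + 1} = \frac{k}{\ell},$


¹ Fig. 2.1 is reproduced with permission of Bibl. Publ. Univ. Genève.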
which means that the same structure is also found in the next line.

The fact that the ratios in the nth row of (2.7) are given by n/1, (n − 1)/2,
(n − 2)/3, ... implies that the coefficients of (2.6) are a product of such ratios;
e.g., the “20” in the 7th line is the product

    $20 = \frac{6}{1} \cdot \frac{5}{2} \cdot \frac{4}{3},$

and we see that Theorem 2.1 is true in general.


These coefficients

(2.10)    $\frac{n(n-1)\cdots(n-j+1)}{1 \cdot 2 \cdots j} = \frac{n(n-1)\cdots(n-j+1)(n-j)\cdots 1}{1 \cdot 2 \cdots j \cdot 1 \cdot 2 \cdots (n-j)} = \frac{n!}{j!\,(n-j)!} = \binom{n}{j}$

are called binomial coefficients and $n! = 1 \cdot 2 \cdots n$ is the factorial of n.
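
As a small check of the argument, the product of the ratios n/1, (n − 1)/2, ... can be compared with the factorial form in (2.10). This is a sketch for illustration; the function name `binom_by_ratios` is an assumption and exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction
from math import factorial

def binom_by_ratios(n, j):
    """Multiply the first j ratios n/1, (n-1)/2, ..., (n-j+1)/j of row n of (2.7)."""
    c = Fraction(1)
    for i in range(j):
        c *= Fraction(n - i, i + 1)
    return c

# the "20" of the text: 6/1 * 5/2 * 4/3
print(binom_by_ratios(6, 3))

# agreement with n! / (j! (n-j)!) for the rows shown in (2.6)
for n in range(8):
    for j in range(n + 1):
        assert binom_by_ratios(n, j) == Fraction(factorial(n), factorial(j) * factorial(n - j))
```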


Application to the Interpolation Polynomial. Expand the expressions in the
difference scheme (1.25):


    y_0
            y_1 − y_0
    y_1                 y_2 − 2y_1 + y_0
            y_2 − y_1                       y_3 − 3y_2 + 3y_1 − y_0 .
    y_2                 y_3 − 2y_2 + y_1
            y_3 − y_2
    y_3


The appearance of Pascal’s triangle is not a coincidence, because each term is the
difference of the two terms to its left.

Furthermore, each term of the scheme (1.25) is the sum of the term above it 
with the term to its right. Consequently, the scheme can also be written as 


    y_0
                                  Δy_0
    y_0 + Δy_0                                      Δ²y_0
                                  Δy_0 + Δ²y_0                        Δ³y_0 .
    y_0 + 2Δy_0 + Δ²y_0                             Δ²y_0 + Δ³y_0
                                  Δy_0 + 2Δ²y_0 + Δ³y_0
    y_0 + 3Δy_0 + 3Δ²y_0 + Δ³y_0


Pascal’s triangle appears again. Formula (2.10) thus yields


    $y_n = y_0 + \frac{n}{1} \Delta y_0 + \frac{n(n-1)}{1 \cdot 2} \Delta^2 y_0 + \frac{n(n-1)(n-2)}{1 \cdot 2 \cdot 3} \Delta^3 y_0 + \ldots,$


and this proves Theorem 1.2.
Negative Exponents. We begin with


    $(a + b)^{-1} = \frac{1}{a + b}.$


If we assume that |b| < |a|, a first approximation to this ratio is 1/a. We try to
improve this value by an unknown quantity δ,


    $\frac{1}{a + b} = \frac{1}{a} + \delta \quad \Longrightarrow \quad 1 = 1 + \frac{b}{a} + a\delta + b\delta.$

Since |b| < |a|, we neglect the term bδ and obtain δ = −b/a². Repeating this
process again and again (or, more precisely, proceeding by induction), we arrive
at


(2.11)    $(a + b)^{-1} = \frac{1}{a} - \frac{b}{a^2} + \frac{b^2}{a^3} - \frac{b^3}{a^4} + \ldots,$

which is the same as Theorem 2.1 for n = −1. This time, however, the series is
infinite.
If we multiply (2.11) by a and put x = b/a, we obtain


(2.12)    $\frac{1}{1 + x} = 1 - x + x^2 - x^3 + x^4 - \ldots, \qquad |x| < 1,$


the famous geometrical series (Viète 1593).


Square Roots. Next, we consider $(a + b)^{1/2} = \sqrt{a + b}$. We again suppose b small,
so that $\sqrt{a + b} \approx \sqrt{a}$, and search for a δ such that

    $\sqrt{a + b} = \sqrt{a} + \delta$

is a better approximation. Then,

    $a + b = (\sqrt{a} + \delta)^2 = a + 2\sqrt{a}\,\delta + \delta^2.$


As δ is small, we neglect δ² and have $\delta = b/(2\sqrt{a})$. Consequently,


(2.13)    $\sqrt{a + b} \approx \sqrt{a} + \frac{b}{2\sqrt{a}}, \qquad |b| \ll a.$


Example. Computation of $\sqrt2$. We start from an approximate value v = 1.4 and
set a = v², b = 2 − a = 2 − v². Then, (2.13) gives as a new approximation


    $v + \frac{2 - v^2}{2v} = \frac{1}{2}\left(\frac{2}{v} + v\right),$


a formula that can be applied repeatedly and yields
    1.4
    1.414285
    1.4142135642
    1.4142135623730950499
    1.4142135623730950488016887242096980790
    1.41421356237309504880168872420969807856967187537694807317667973799 .

The same calculation performed in base 60 starting with 1,25 gives 1,24,51,10
(commas separate digits in base 60), a value found on a Babylonian table dating
from 1900 B.C. (see Fig. 2.2, see also van der Waerden 1954, Chap. II, Plate 8b).
This indicates that formula (2.13) has been in use since Babylonian and Greek
antiquity.
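
The iteration and the base-60 value can be reproduced with exact fractions. This is a minimal sketch under stated assumptions: the helper names `babylonian_step` and `sexagesimal` are illustrative, and the base-60 digits are obtained by truncation.

```python
from fractions import Fraction

def babylonian_step(v):
    """One step v -> (2/v + v)/2, i.e. formula (2.13) with a = v^2 and b = 2 - v^2."""
    return (2 / v + v) / 2

def sexagesimal(x, places):
    """Integer part followed by the first `places` base-60 digits of a positive Fraction."""
    digits = [int(x)]
    frac = x - int(x)
    for _ in range(places):
        frac *= 60
        digits.append(int(frac))
        frac -= int(frac)
    return digits

# decimal iteration starting from v = 1.4
v = Fraction(14, 10)
for _ in range(4):
    v = babylonian_step(v)
    print(float(v))

# one step starting from 1,25 in base 60 (that is, 1 + 25/60)
print(sexagesimal(babylonian_step(1 + Fraction(25, 60)), 3))
```

Starting from 1,25 the single step gives the fraction 577/408, whose leading base-60 digits are those quoted from the tablet.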


FIGURE 2.2. Babylonian cuneiform tablet YBC 7289 from 1900 B.C. representing a
square of side 30, with diagonal given as 42,25,35 and ratio 1,24,51,10²


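Next Step (Alkalsadi around 1450, Briggs 1624). To improve (2.13), consider


    $\sqrt{a + b} = \sqrt{a} + \frac{b}{2\sqrt{a}} + \delta,$

compute the square

    $a + b = a + b + \frac{b^2}{4a} + 2\sqrt{a}\,\delta + \frac{b\delta}{\sqrt{a}} + \delta^2,$

neglect the last two terms, and obtain

(2.14)    $\sqrt{a + b} \approx \sqrt{a} + \frac{b}{2\sqrt{a}} - \frac{b^2}{8\sqrt{a^3}}.$


Example. For $\sqrt2$, we obtain this time as new approximation

    $v + \frac{2 - v^2}{2v} - \frac{(2 - v^2)^2}{8v^3} = \frac{3v}{8} + \frac{3}{2v} - \frac{1}{2v^3},$


² Reproduced with permission of Yale Babylonian Collection.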
the repeated use of which, starting with v = 1.4, gives rapid convergence:


    1.4
    1.4142128
    1.41421356237309504870
    1.41421356237309504880168872420969807856967187537694807317643 .


Equations (2.13) and (2.14) become noticeably neater if we divide them by
$\sqrt{a}$ and if b/a is replaced by x:


    $(1 + x)^{1/2} \approx 1 + \frac{x}{2}, \qquad (1 + x)^{1/2} \approx 1 + \frac{x}{2} - \frac{x^2}{8}.$


In order to obtain a more precise approximation, we can continue the above
calculations. The result will be a series of the type


    $(1 + x)^{1/2} = 1 + \frac{x}{2} + bx^2 + cx^3 + dx^4 + \ldots,$


whose coefficients b, c, d, ... we want to determine. Inserting this series into the
relation $(1 + x)^{1/2}(1 + x)^{1/2} = 1 + x$ and comparing equal powers of x yields
b = −1/8, c = 1/16, d = −5/128, ... . Consequently, we have the better
approximation (Newton 1665)


(2.15)    $(1 + x)^{1/2} = 1 + \frac{1}{2}x - \frac{1}{8}x^2 + \frac{1}{16}x^3 - \frac{5}{128}x^4 \pm \ldots$


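We note that

    $-\frac{1}{8} = -\frac{1 \cdot 1}{2 \cdot 4} = \frac{\frac12(\frac12 - 1)}{1 \cdot 2}, \qquad \frac{1}{16} = \frac{1 \cdot 1 \cdot 3}{2 \cdot 4 \cdot 6} = \frac{\frac12(\frac12 - 1)(\frac12 - 2)}{1 \cdot 2 \cdot 3},$

    $-\frac{5}{128} = -\frac{1 \cdot 1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6 \cdot 8} = \frac{\frac12(\frac12 - 1)(\frac12 - 2)(\frac12 - 3)}{1 \cdot 2 \cdot 3 \cdot 4},$

which leads to the conjecture that Theorem 2.1 is also true for n = 1/2. The
sequence 1 + x/2, 1 + x/2 − x²/8, ... sketched in Fig. 2.3, illustrates the
convergence of (2.15) toward $\sqrt{1 + x}$ for −1 < x < 1.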
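
The comparison of equal powers of x can be carried out mechanically. The following sketch (the function name `sqrt_series` is an assumption, not notation from the text) determines the coefficients one at a time from the requirement that the square of the series equals 1 + x, and then checks them against the product formula noted above.

```python
from fractions import Fraction

def sqrt_series(n_terms):
    """Coefficients c_0, c_1, ... with (c_0 + c_1 x + c_2 x^2 + ...)^2 = 1 + x."""
    c = [Fraction(1), Fraction(1, 2)]
    for k in range(2, n_terms):
        # the coefficient of x^k in the square must vanish:
        # 2*c_0*c_k + sum_{i=1}^{k-1} c_i*c_{k-i} = 0, with c_0 = 1
        s = sum(c[i] * c[k - i] for i in range(1, k))
        c.append(-s / 2)
    return c[:n_terms]

coeffs = sqrt_series(6)
print(coeffs)

# compare with (1/2)(1/2 - 1)...(1/2 - k + 1) / (1*2*...*k)
half = Fraction(1, 2)
binom = Fraction(1)
for k, ck in enumerate(coeffs):
    assert ck == binom
    binom *= (half - k) / (k + 1)
```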


Arbitrary Rational Exponents. 

All this was in the two plague years of 1665 and 1666, for in those days 

I was in the prime of my age for invention, and minded mathematics 

and philosophy more than at any other time since. 

(Newton, quoted from Kline 1972, p. 357) 

One of Newton’s ideas of these “anni mirabiles”, inspired by the work of Wallis
(see the remark following Eq. (5.27)), was to try to interpolate the polynomials
$(1 + x)^0$, $(1 + x)^1$, $(1 + x)^2$, ... in order to obtain a series for $(1 + x)^a$ where a is
some rational number. This means that we must interpolate the coefficients given
in Theorem 2.1 (see Fig. 2.4). Since the latter are polynomials in n, it is clear
that the result is given by the same expression with n replaced by a. We therefore
arrive at the general theorem.

[Figure 2.3: the partial sums 1 + x/2, 1 + x/2 − x²/8, ... of the series (2.15)]

(2.2) Theorem (Generalized binomial theorem of Newton). For any rational a we
have for |x| < 1


    $(1 + x)^a = 1 + \frac{a}{1} x + \frac{a(a-1)}{1 \cdot 2} x^2 + \frac{a(a-1)(a-2)}{1 \cdot 2 \cdot 3} x^3 + \ldots$


FIGURE 2.4. Interpolation of Pascal’s triangle, Newton’s autograph (1665)³


Even Newton found that his interpolation argument was dangerous. Euler,
in his Introductio (1748, §71), stated the general theorem (“ex hoc theoremate
universali”) without any further proof or comment. Only Abel, a century later, felt
the need for a rigorous proof (see Sect. III.7 below).
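
A numerical experiment is of course no substitute for the proof, but it illustrates the statement of Theorem 2.2. The sketch below is illustrative only; the function name `binomial_series` and the sample values of a and x are assumptions.

```python
from fractions import Fraction

def binomial_series(a, x, n_terms):
    """Partial sum 1 + a x + a(a-1)/(1*2) x^2 + ... with n_terms terms."""
    term, total = Fraction(1), Fraction(0)
    for k in range(n_terms):
        total += term
        term *= (a - k) * x / (k + 1)  # next term from the previous one
    return total

# integer exponent: the series stops by itself (Theorem 2.1)
print(binomial_series(Fraction(3), Fraction(2), 10))

# rational exponents, |x| < 1: the partial sums approach (1 + x)^a
x = Fraction(1, 2)
for a in (Fraction(1, 2), Fraction(-1), Fraction(1, 3)):
    approx = float(binomial_series(a, x, 40))
    print(a, approx, (1 + float(x)) ** float(a))
```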


3 Fig. 2.4 is reproduced with permission of Cambridge University Press. 
Remark. This is precisely the formula that was engraved on Newton’s gravestone
in 1727 at Westminster Abbey. Don’t make useless efforts ... for the past hundred
years the formula has been illegible.


Exponential Function 


... ubi e denotat numerum, cuius logarithmus hyperbolicus est 1. 

(first definition of e; Euler 1736b, Mechanica, p. 60) 

Origins. 1. F. Debeaune (1601-1652) was the first reader of Descartes’ “Géométrie”
of 1637. A year later, he posed Descartes the following geometrical problem:
find a curve y(x) such that for each point P the distances between V and T, the
points where the vertical and the tangent line cut the x-axis, are always equal to
a given constant a (see Fig. 2.5a). Despite the efforts of Descartes and Fermat,
this problem remained unsolved for nearly 50 years. Leibniz (1684, “... tentavit,
sed non solvit”) then proposed the following solution (see Fig. 2.5b): Let x, y be
a given point. Then, increase x by a small increment b, so that y increases (due
to the similarity of two triangles) by yb/a. Repeating, we obtain a sequence of
values

    $y, \quad \left(1 + \frac{b}{a}\right) y, \quad \left(1 + \frac{b}{a}\right)^2 y, \quad \left(1 + \frac{b}{a}\right)^3 y, \quad \ldots$

for the abscissae x, x + b, x + 2b, x + 3b, ... .


FIGURE 2.5a. Debeaune’s problem        FIGURE 2.5b. Leibniz’s solution


2. Questions like “If the population in a certain region increases annually by
one thirtieth and at one time there were 100,000 inhabitants, what would be the
population after 100 years?” (Euler 1748, Introductio §110) or “A certain man
borrowed 400,000 florins at the usurious rate of five percent annual interest ...”
(Introductio §111) lead to the computation of expressions such as


(2.16)    $\left(1 + \frac{1}{30}\right)^{100}, \quad (1 + 0.05)^N, \quad$ or in general $\quad (1 + w)^N,$


where w is small and N is large.
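Euler’s Number. Suppose first that w = 1/N. We compute (2.16) with the help of
Theorem 2.1,

    $\left(1 + \frac{1}{N}\right)^N = 1 + \frac{N}{N} + \frac{N(N-1)}{1 \cdot 2} \cdot \frac{1}{N^2} + \frac{N(N-1)(N-2)}{1 \cdot 2 \cdot 3} \cdot \frac{1}{N^3} + \ldots$

    $\qquad = 1 + 1 + \frac{1\,(1 - \frac{1}{N})}{1 \cdot 2} + \frac{1\,(1 - \frac{1}{N})(1 - \frac{2}{N})}{1 \cdot 2 \cdot 3} + \ldots$

Here, Euler states without wincing that “if N is a number larger than any
assignable number, then $\frac{N-1}{N}$ is equal to 1”. This shows that as N tends to
infinity, $\left(1 + \frac{1}{N}\right)^N$ tends to the so-called Euler number


(2.17)    $e = 1 + 1 + \frac{1}{1 \cdot 2} + \frac{1}{1 \cdot 2 \cdot 3} + \frac{1}{1 \cdot 2 \cdot 3 \cdot 4} + \ldots$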


We emphasize that this argument is dangerous, because it is applied infinitely 
often. For example, by a similar “proof” we would obtain 


1 1 1 1 1 1 1 


We shall return to this question in Sect. III.2. Table 2.1 compares the convergence
of the series with that of $(1 + \frac{1}{N})^N$.


TABLE 2.1. Computation of e


     N    (1 + 1/N)^N    1 + 1/1! + 1/2! + ... + 1/N!
     1    2.000          2.0
     2    2.250          2.5
     3    2.370          2.66
     4    2.441          2.708
     5    2.488          2.7166
     6    2.522          2.71805
     7    2.546          2.718253
     8    2.566          2.7182787
     9    2.581          2.71828152
    10    2.594          2.718281801
    11    2.604          2.7182818261
    12    2.613          2.71828182828
    13    2.621          2.718281828446
    14    2.627          2.7182818284582
    15    2.633          2.71828182845899
    16    2.638          2.7182818284590422
    17    2.642          2.71828182845904507
    18    2.646          2.718281828459045226
    19    2.650          2.7182818284590452349
    20    2.653          2.718281828459045235339
    21    2.656          2.7182818284590452353593
    22    2.659          2.718281828459045235360247
    23    2.661          2.7182818284590452353602857
    24    2.664          2.718281828459045235360287404
    25    2.666          2.7182818284590452353602874687
    26    2.668          2.71828182845904523536028747125
    27    2.670          2.718281828459045235360287471349
    28    2.671          2.71828182845904523536028747135254
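
The two columns of Table 2.1 can be recomputed with exact rational arithmetic. This is a minimal sketch; the function names `compound` and `series` are assumptions chosen for the illustration.

```python
from fractions import Fraction
from math import factorial

def compound(N):
    """(1 + 1/N)^N as an exact fraction."""
    return Fraction(N + 1, N) ** N

def series(N):
    """1 + 1/1! + 1/2! + ... + 1/N! as an exact fraction."""
    return sum(Fraction(1, factorial(k)) for k in range(N + 1))

for N in (1, 2, 3, 4, 5, 10, 15):
    print(f"{N:3d}  {float(compound(N)):.3f}  {float(series(N)):.14f}")
```

The slow growth of the first column and the rapid settling of the second are visible after a handful of rows.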
FIGURE 2.6a. $(1 + \frac{x}{N})^N$        FIGURE 2.6b. $1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \ldots$


Powers of e. We next set w = x/N in (2.16), where x is a fixed, say rational
number. That is to say that we simultaneously let N tend to infinity and w to zero
in such a manner that their product remains equal to the constant x. Exactly the
same manipulation as above now leads to the result


(2.18)    $\left(1 + \frac{x}{N}\right)^N \to 1 + x + \frac{x^2}{1 \cdot 2} + \frac{x^3}{1 \cdot 2 \cdot 3} + \ldots .$

On the other hand, we set M = N/x, N = xM for those values of N such that
M is an integer. This gives, for N and M tending to infinity,


(2.19)    $\left(1 + \frac{x}{N}\right)^N = \left(1 + \frac{1}{M}\right)^{xM} = \left(\left(1 + \frac{1}{M}\right)^M\right)^x \to e^x.$


On combining (2.18) and (2.19), we have the following theorem. 


(2.3) Theorem (Euler 1748, Introductio §123, 125). For N tending to infinity,


    $\left(1 + \frac{x}{N}\right)^N \to e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \ldots$


The convergence of these expressions to $e^x$ (also denoted by exp x) is illustrated
in Figs. 2.6a and 2.6b. The dotted line represents the exact function $e^x$.
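
The two expressions of Theorem 2.3 can be compared in floating-point arithmetic. The following sketch is an illustration only (the function names `power_limit` and `exp_series` and the sample point x = 2 are assumptions); `math.exp` serves as the reference value.

```python
import math

def power_limit(x, N):
    """(1 + x/N)^N for a finite N."""
    return (1 + x / N) ** N

def exp_series(x, n_terms):
    """Partial sum 1 + x + x^2/2! + ... with n_terms terms."""
    term, total = 1.0, 0.0
    for k in range(n_terms):
        total += term
        term *= x / (k + 1)
    return total

x = 2.0
for N in (1, 10, 100, 10_000):
    print(N, power_limit(x, N), exp_series(x, min(N, 30)), math.exp(x))
```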
Exercises


2.1 Verify the following formula (Euler 1755, Opera vol. X, p. 280) by using
    $50 = 2 \cdot 5^2 = 7^2 + 1$:

    $\sqrt2 = \frac{7}{5}\left(1 + \frac{1}{100} + \frac{1 \cdot 3}{100 \cdot 200} + \frac{1 \cdot 3 \cdot 5}{100 \cdot 200 \cdot 300} + \text{etc.}\right)$


    “quae ad computum in fractionibus decimalibus instituendum est aptissima”.
    Add numerically five terms of this series.
    Hint. Work with the series for $(1 - x)^{-1/2}$.


2.2 Show that the number, written in base 60 as 1,25, is a good approximation
    to $\sqrt2$. Show that one iteration of the “Babylonian square root algorithm”
    deduced from formula (2.13) leads to 1,24,51,10,..., the value of Fig. 2.2.


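2.3 By multiplying the series
    $(1 + x)^{1/3} = 1 + ax + bx^2 + cx^3 + \ldots$
    with itself twice, determine the coefficients a, b, c, ... to find

    $(1 + x)^{1/3} = 1 + \frac{1}{3}x - \frac{2}{3 \cdot 6}x^2 + \frac{2 \cdot 5}{3 \cdot 6 \cdot 9}x^3 - \ldots$

    By using $2 \cdot 4^3 - 5^3 = 3$, obtain the formula

    $\sqrt[3]{2} = \frac{5}{4}\left(1 + \frac{1}{1 \cdot 125} - \frac{2}{1 \cdot 2 \cdot (125)^2} + \frac{2 \cdot 5}{1 \cdot 2 \cdot 3 \cdot (125)^3} - \frac{2 \cdot 5 \cdot 8}{1 \cdot 2 \cdot 3 \cdot 4 \cdot (125)^4} + \ldots\right).$


    Remark. The determination of $\sqrt[3]{2}$ was one of the great problems of Greek
    mathematics (double the volume of the cube).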


2.4 (Bernoulli’s inequality; Jac. Bernoulli 1689, see 1744, Opera, p. 380; Barrow
    1670, see 1860, Works, Lectio VII, §XIII, p. 224). By induction on n, prove
    that

    $(1 + a)^n \ge 1 + na \quad$ for $a \ge -1$, n = 0, 1, 2, ...

    $1 - na < (1 - a)^n < \frac{1}{1 + na} \quad$ for 0 < a < 1, n = 2, 3, ...


2.5 In order to study the convergence of $(1 + \frac{1}{n})^n$ to e, consider the sequences


    $a_n = \left(1 + \frac{1}{n}\right)^n \quad$ and $\quad b_n = \left(1 + \frac{1}{n}\right)^{n+1}.$

    Show that


    $a_1 < a_2 < a_3 < \ldots < e < \ldots < b_3 < b_2 < b_1$


    and that $b_n - a_n < 4/n$.
    Hint. Use the second inequality of Exercise 2.4 with a = 1/n².
I.3 Logarithms and Areas


Tabularum autem logarithmicarum amplissimus est usus .. . 
(Euler 1748, Introductio, §110) 


Students usually find the concept of logarithms very difficult to understand. 
(B.L. van der Waerden 1957, p. 1) 


M. Stifel (1544) highlights the two series (see facsimile in Fig. 3.1) 


[Facsimile (Arithmeticae Liber III, p. 237) showing the two series

    0.   1.   2.   3.   4.    5.    6.    7.     8.
    1.   2.   4.   8.   16.   32.   64.   128.   256.

and (p. 250) their continuation to the left

    -3    -2    -1    0    1    2    3    4     5     6
    1/8   1/4   1/2   1    2    4    8    16    32    64 ]


FIGURE 3.1. Extracts from Stifel’s book (p. 237 and 250)¹


We see that passing from the lower to the upper line transforms products into sums.
For example, instead of multiplying 8 by 32 “in inferiore ordine”, we take the
corresponding “logarithms” 3 and 5 “in superiore ordine”, compute their sum which
is 8, return from there “in inferiore ordine”, and find the product 8 · 32 = 256.
A more detailed table of this type would be of great use since additions are easier
than multiplications. Such “logarithmic” tables (λόγος is Greek for “word,
relation”, ἀριθμός means “number”, logarithms are therefore useful relations
between numbers) were first computed by John Napier (1614, 1619), Henry Briggs
(1624), and Jost Bürgi (1620).


(3.1) Definition. A function ℓ(x), defined for positive values of x, is called a
logarithmic function if for all x, y > 0


(3.1)    $\ell(x \cdot y) = \ell(x) + \ell(y).$


¹ Reproduced with permission of Bibl. Publ. Univ. Genève.




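If we set first y = z/x and then x = y = 1 in (3.1), we obtain


(3.2)    $\ell(z/x) = \ell(z) - \ell(x),$

(3.3)    $\ell(1) = 0.$

Applying (3.1) twice to $x \cdot y \cdot z = (x \cdot y) \cdot z$ gives

(3.4)    $\ell(x \cdot y \cdot z) = \ell(x) + \ell(y) + \ell(z),$


and similarly for products with four or more terms. Next, applying (3.4) to
$\sqrt[3]{x} \cdot \sqrt[3]{x} \cdot \sqrt[3]{x} = x$, we obtain $\ell(\sqrt[3]{x}) = \frac{1}{3}\ell(x)$, or in general


(3.5)    $\ell(x^{m/n}) = \frac{m}{n}\,\ell(x), \quad$ where $\quad x^{m/n} = \sqrt[n]{x^m}.$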


Bases. Let a fixed logarithmic function ℓ(x) be given and suppose that there exists
a number a for which ℓ(a) = 1. Then, (3.5) becomes


(3.6)    $\ell(a^{m/n}) = \frac{m}{n},$


i.e., the logarithmic function is the inverse function for the exponential function
$a^x$. We call this the logarithm to the base a and write


(3.7)    $y = \log_a x \quad$ if $\quad x = a^y.$


Logarithms to the base 10 (Briggs’ logarithms) are the most suitable for nu- 
merical computations, since a shift of the decimal point just adds an integer to 
the logarithm. The best base for theoretical work, as we soon shall see, is Euler’s 
number e (natural or Naperian or hyperbolic logarithms). These logarithms are 
usually denoted by In x or log x. 


Euler’s “Golden Rule”. If the logarithms for one base are known, the logarithms
for all other bases are obtained by a simple division. To see this, take the logarithm
to the base b of $x = a^y$ and use (3.7) and (3.5). This yields


(3.8)    $\log_b x = y \cdot \log_b a \quad \Longrightarrow \quad y = \log_a x = \frac{\log_b x}{\log_b a}.$


Computation of Logarithms 


By computing the square root of the base a, then the square root of the square 
root, and so on, and by multiplying all these values, we obtain, with the help of 
(3.6) and (3.1), the logarithms of many numbers. This is illustrated for a = 10 in 
Fig. 3.2. 
    Numbers    Logarithms
    10.0000    1.
     7.4989    0.875
     5.6234    0.75
     4.2170    0.625
     3.1623    0.5
     2.3714    0.375
     1.7783    0.25
     1.3335    0.125
     1.0000    0.


FIGURE 3.2. Successive roots of 10 and their products


There remains a problem: we would prefer to know the logarithms of such num- 
bers as 2, 3, 4,... and not of 4.2170 or 2.3714. 


Briggs’ Method. Compute the root of 10, then the root of the root, and continue
doing so 54 times (see facsimile in Fig. 3.3). This gives, with $c = 1/2^{54}$,


(3.9a)    $10^c = 1.00000\ 00000\ 00000\ 12781\ 91493\ 20032\ 35 = 1 + a.$

Then, compute in the same way the successive roots of 2:

(3.9b)    $2^c = 1.00000\ 00000\ 00000\ 03847\ 73979\ 65583\ 10 = 1 + b.$


The value $x = \log_{10} 2$ we are searching for satisfies $2 = 10^x$. Hence,


    $1 + b \overset{(3.9b)}{=} 2^c = (10^x)^c = (10^c)^x \overset{(3.9a)}{=} (1 + a)^x \overset{\text{Theorem 2.2}}{\approx} 1 + ax$

and we obtain

(3.10)    $\log_{10}(2) = x \approx \frac{b}{a} = \frac{3847739796558310}{12781914932003235} \approx 0.30102999566398.$


This gives us one value. The amount of work necessary for the whole table is 
hardly imaginable. 
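
With high-precision decimal arithmetic, Briggs’ computation takes a few lines. This is a sketch under stated assumptions: the working precision of 60 digits and the helper name `repeated_sqrt` are choices made for the illustration, not part of the historical method.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60  # assumed working precision; Briggs carried about 32 digits

def repeated_sqrt(value, times):
    """Take the square root `times` times in succession."""
    r = Decimal(value)
    for _ in range(times):
        r = r.sqrt()
    return r

a = repeated_sqrt(10, 54) - 1  # 10^c = 1 + a with c = 1/2^54
b = repeated_sqrt(2, 54) - 1   # 2^c  = 1 + b
print(a)
print(b)
print(b / a)                   # approximates log10(2), as in (3.10)
```

The quotient differs from $\log_{10} 2$ only through the neglected higher-order terms of Theorem 2.2, which are of the order of a itself.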


Interpolation. Interpolation was an important tool for speeding up the
computation of logarithms in ancient times. Say, for example, that four values of
$\log_{10}$ have been computed. We compute the difference scheme


    log(44) = 1.6434526765
                               0.0097598373
    log(45) = 1.6532125138                     −0.0002145194
                               0.0095453179                     0.0000092277 .
    log(46) = 1.6627578317                     −0.0002052917
                               0.0093400262
    log(47) = 1.6720978579
[Facsimile: table of the successive square roots of 10 (“Numeri continue Medii
inter Denarium & Unitatem”) and of their logarithms (“Logarithmi rationales”)]


FIGURE 3.3. Briggs’ computation of successive roots of 10, Briggs (1624)²


² Fig. 3.3 is reproduced with permission of Bibl. Publ. Univ. Genève.
This gives the interpolation polynomial (Theorem 1.1, shifted)


(3.11)    $p(x) = 1.6434526765 + (x - 44)\Bigl(0.0097598373 + \frac{x - 45}{2}\Bigl(-0.0002145194 + \frac{x - 46}{3} \cdot 0.0000092277\Bigr)\Bigr),$

for which some selected values with errors are given in Table 3.1. The results are
quite good despite the ease of computations. By adding additional points, one can
increase the precision whenever this is desired.


TABLE 3.1. Errors of interpolation polynomial


    x        p(x)           log(x)         err
    44.25    1.645913252    1.645913275     2.34 · 10⁻⁸
    44.50    1.648359987    1.648360011     2.42 · 10⁻⁸
    44.75    1.650793026    1.650793040     1.35 · 10⁻⁸
    45.25    1.655618594    1.655618584    −1.05 · 10⁻⁸
    45.50    1.658011411    1.658011397    −1.43 · 10⁻⁸
    45.75    1.660391109    1.660391098    −1.04 · 10⁻⁸
    46.25    1.665111724    1.665111737     1.32 · 10⁻⁸
    46.50    1.667452930    1.667452953     2.34 · 10⁻⁸
    46.75    1.669781593    1.669781615     2.24 · 10⁻⁸
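
The entries of Table 3.1 can be recomputed from the four differences alone. The sketch below is illustrative (the names `Y0`, `D1`, `D2`, `D3`, and `p` are assumptions); it evaluates the polynomial (3.11) in nested form and uses `math.log10` as the reference.

```python
import math

# top entries of the difference scheme for log(44), ..., log(47)
Y0 = 1.6434526765
D1, D2, D3 = 0.0097598373, -0.0002145194, 0.0000092277

def p(x):
    """Interpolation polynomial (3.11) in nested form."""
    return Y0 + (x - 44) * (D1 + (x - 45) / 2 * (D2 + (x - 46) / 3 * D3))

for x in (44.25, 44.50, 44.75, 45.25, 45.50, 45.75, 46.25, 46.50, 46.75):
    exact = math.log10(x)
    print(f"{x:6.2f}  {p(x):.9f}  {exact:.9f}  {exact - p(x):+.2e}")
```

At the four nodes the polynomial returns the tabulated logarithms themselves; in between, the error stays at the level of a few units in the eighth decimal.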


Before going on with the calculus of logarithms, we make a little excursion into 
geometry. 


Computation of Areas 


The determination of areas and volumes exercised the curiosity of mathematicians 
since Greek antiquity. Two of the greatest achievements of Archimedes (283-212 
B.C.) were the computation of the area of the parabola and of the circle. The early 
17th century then saw the computation of areas under the curve y = x° with either 
integer or arbitrary values of a (Bonaventura Cavalieri, Roberval, Fermat). 


Problem. Given a, find the area below the curve $y = x^a$ between the bounds
x = 0 and x = B.
Solution (Fermat 1636). We choose θ < 1 but close to 1 and consider the
rectangles formed by the geometric progression B, θB, θ²B, θ³B, ... (Fig. 3.4b), of
height $B^a$, $\theta^a B^a$, $\theta^{2a} B^a$, $\theta^{3a} B^a$, ... . Then, the area can be approximated by the
geometrical series


          1st Rect. + 2nd Rect. + 3rd Rect. + ...
          $= B(1 - \theta)B^a + B(\theta - \theta^2)\theta^a B^a + B(\theta^2 - \theta^3)\theta^{2a} B^a + \ldots$
(3.12)    $= B^{a+1}(1 - \theta)\bigl(1 + \theta^{a+1} + \theta^{2a+2} + \ldots\bigr) = B^{a+1}\,\frac{1 - \theta}{1 - \theta^{a+1}}$
if a + 1 > 0 or, equivalently, a > −1 (see Eq. (2.12)). Let θ = 1 − ε with ε small.
Then, 1 − θ = ε, $\theta^{a+1} = 1 - (a + 1)\varepsilon + \ldots$ by Theorem 2.2. Consequently,

    $\frac{1 - \theta}{1 - \theta^{a+1}} = \frac{\varepsilon}{(a + 1)\varepsilon + \ldots} \to \frac{1}{a + 1} \quad$ for ε → 0.

The sum of the rectangles (3.12) approximates (for a > −1) the area S from
above. If we replace the heights of the rectangles by $\theta^a B^a$, $\theta^{2a} B^a$, ... we get an
approximation of S from below. In this situation, the value (3.12) is just multiplied
by $\theta^a$, which, for θ → 1, tends to 1. Therefore, both approximations tend to the
same value and we get the following result.


FIGURE 3.4a. Fermat 1601-1665³        FIGURE 3.4b. Fermat’s calculation of the
                                      area below $x^a$


(3.2) Theorem (Fermat 1636). The area below the curve $y = x^a$ and bounded by
x = 0 and x = B is given by


    $S = \frac{B^{a+1}}{a + 1} \quad$ if $\quad a > -1.$
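
Fermat’s sum of rectangles is easy to evaluate numerically. The following is a minimal sketch; the function name `fermat_sum`, the cut-off after a fixed number of rectangles, and the sample values a = 2, B = 1 are assumptions made for the illustration.

```python
def fermat_sum(a, B, theta, n_terms=200_000):
    """Sum of the rectangles on B, theta*B, theta^2*B, ... with heights x^a."""
    total, x = 0.0, B
    for _ in range(n_terms):
        total += (x - theta * x) * x ** a  # width times height at the right end
        x *= theta
    return total

a, B = 2, 1.0
for theta in (0.9, 0.99, 0.999):
    closed_form = B ** (a + 1) * (1 - theta) / (1 - theta ** (a + 1))  # (3.12)
    print(theta, fermat_sum(a, B, theta), closed_form, B ** (a + 1) / (a + 1))
```

As θ approaches 1 the sums approach $B^{a+1}/(a+1)$, here 1/3.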


Area of the Hyperbola and Natural Logarithms 


In the month of September 1668, Mercator published his Logarithmotech- 
nia, which contains an example of this method (i.e., of infinite series) in a 
single case, namely the quadrature of the hyperbola. 
(Letter of Collins, July 26, 1672) 
Fermat’s method does not apply to a hyperbola y = 1/x. In fact, the geometric
sequence of abscissae B, θB, θ²B, θ³B, ... becomes, for the areas, the sum
(1 − θ)(1 + 1 + 1 + ...), whose partial sums form an arithmetic progression.
This motivates the following discovery (made by Gregory of St. Vincent in 1647
and Alfons Anton de Sarasa in 1649; see Kline 1972, p. 354): the area below the
hyperbola y = 1/x is a logarithm (see Fig. 3.5).


³ Fermat’s portrait is reproduced with permission of Bibl. Math. Univ. Genève.
FIGURE 3.5. The area of the hyperbola as a logarithm


We observe (by contracting the x-coordinates and stretching the y-coordinates)
that, e.g., Area (3 → 6) = Area (1 → 2). Therefore,


    Area (1 → 3) + Area (1 → 2) = Area (1 → 6).

This means that the function ln(a) = Area (1 → a) satisfies the identity

    $\ln(a) + \ln(b) = \ln(a \cdot b)$


and is therefore a logarithm (the “natural” logarithm).


FIGURE 3.6. Term-by-term integration of the geometrical series


Mercator’s Series. After a shift of the origin by 1 we have that ln(1 + a) is the area
below 1/(1 + x) between 0 and a. We substitute $1/(1 + x) = 1 - x + x^2 - x^3 + \ldots$
(formula (2.12)) and insert for the areas below 1, x, x², ... between 0 and a the
expressions of Theorem 3.2:


    $a, \quad \frac{a^2}{2}, \quad \frac{a^3}{3}, \quad \frac{a^4}{4}, \quad \ldots$

(see Fig. 3.6). In this way, we find, after replacing a by x (N. Mercator 1668),

(3.13)    $\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \frac{x^5}{5} - \ldots .$


The convergence of this series for various values of x is shown in Fig. 3.7. With
the value x = 1 this series becomes

(3.13a)    $\ln 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \ldots,$


a beautiful formula of limited practical use (see Table 3.2). For still larger values
of x the series does not converge at all.


FIGURE 3.7. Convergence of $x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \ldots$ to ln(1 + x)


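Gregory’s Series. Replace x in (3.13) by −x:


(3.14)    $\ln(1 - x) = -x - \frac{x^2}{2} - \frac{x^3}{3} - \frac{x^4}{4} - \frac{x^5}{5} - \ldots,$


and then subtract this equation from (3.13). This gives (Gregory 1668)


(3.15)    $\ln\frac{1 + x}{1 - x} = 2\left(x + \frac{x^3}{3} + \frac{x^5}{5} + \frac{x^7}{7} + \frac{x^9}{9} + \ldots\right).$


Examples. Putting x = 1/2 in (3.14) and x = 1/3 in (3.15) we obtain the following
two series for ln 2:


(3.14a)    $\ln 2 = \frac{1}{2} + \frac{1}{2 \cdot 2^2} + \frac{1}{3 \cdot 2^3} + \frac{1}{4 \cdot 2^4} + \frac{1}{5 \cdot 2^5} + \ldots$

(3.15a)    $\ln 2 = 2\left(\frac{1}{3} + \frac{1}{3 \cdot 3^3} + \frac{1}{5 \cdot 3^5} + \frac{1}{7 \cdot 3^7} + \ldots\right)$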
TABLE 3.2. Convergence of the series for ln 2


     n    (3.13a)    (3.14a)     (3.15a)
     1    1.000      0.500       0.667
     2    0.500      0.625       0.6914
     3    0.833      0.667       0.69300
     4    0.583      0.6823      0.693135
     5    0.783      0.6885      0.6931460
     6    0.617      0.6911      0.69314707
     7    0.760      0.69226     0.693147170
     8    0.635      0.69275     0.6931471795
     9    0.746      0.69297     0.693147180459
    10    0.646      0.693065    0.6931471805498
    11    0.737      0.693109    0.6931471805589
    12    0.653      0.693130    0.69314718055984


The performance of the three series (3.13a), (3.14a), (3.15a) for ln 2 is compared
in Table 3.2. It is obvious which one is best.
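
The three columns of Table 3.2 are partial sums and can be regenerated in a few lines. This is a sketch for illustration; the function names `mercator`, `half_series`, and `gregory` are assumptions, and `math.log` supplies the reference value.

```python
from math import log

def mercator(n):
    """First n terms of (3.13a): 1 - 1/2 + 1/3 - ..."""
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))

def half_series(n):
    """First n terms of (3.14a): 1/2 + 1/(2*2^2) + 1/(3*2^3) + ..."""
    return sum(1 / (k * 2 ** k) for k in range(1, n + 1))

def gregory(n):
    """First n terms of (3.15a): 2 (1/3 + 1/(3*3^3) + 1/(5*3^5) + ...)."""
    return 2 * sum(1 / ((2 * k - 1) * 3 ** (2 * k - 1)) for k in range(1, n + 1))

for n in range(1, 13):
    print(f"{n:2d}  {mercator(n):.3f}  {half_series(n):.6f}  {gregory(n):.14f}")
print("ln 2 =", log(2))
```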


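Computation of ln p for Primes ≥ 3. Because of (3.1), it is sufficient to compute
the logarithms of the prime numbers. The logarithms of composite integers and
rational numbers are then obtained by addition and subtraction. The idea is to
divide p by a number close to it for which the logarithm is already known. Then,
we can apply series (3.15) with a small value of x and obtain rapid convergence.
For example, for p = 3 we write


    $3 = \frac{3}{2} \cdot 2 = \frac{1 + \frac15}{1 - \frac15} \cdot 2$

so that

(3.16)    $\ln 3 = \ln\frac{3}{2} + \ln 2 = \ln\frac{1 + \frac15}{1 - \frac15} + \ln 2.$


Another possibility is 3 = (3/4) · 4, which leads to


(3.17)    $\ln 3 = 2\ln 2 - \ln\frac{1 + \frac17}{1 - \frac17}.$


Still better is the use of the geometric mean of the above expressions:


          $3 = \sqrt{\tfrac{9}{8}} \cdot \sqrt{8} \quad \Longrightarrow \quad \ln 3 = \frac{3}{2}\ln 2 + \frac{1}{2}\ln\frac{1 + \frac{1}{17}}{1 - \frac{1}{17}},$

(3.18)    $5 = \sqrt{\tfrac{25}{24}} \cdot \sqrt{24} \quad \Longrightarrow \quad \ln 5 = \frac{3}{2}\ln 2 + \frac{1}{2}\ln 3 + \frac{1}{2}\ln\frac{1 + \frac{1}{49}}{1 - \frac{1}{49}},$

          $7 = \sqrt{\tfrac{49}{48}} \cdot \sqrt{48} \quad \Longrightarrow \quad \ln 7 = 2\ln 2 + \frac{1}{2}\ln 3 + \frac{1}{2}\ln\frac{1 + \frac{1}{97}}{1 - \frac{1}{97}},$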
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and so on. The larger p is, the better the series (3.15) converges. The first values 
obtained in this way are 


ln(1) = 0.000000000000000000000000000000
ln(2) = 0.693147180559945309417232121458
ln(3) = 1.098612288668109691395245236923
ln(4) = 1.386294361119890618834464242916
ln(5) = 1.609437912434100374600759333226
ln(6) = 1.791759469228055000812477358381
ln(7) = 1.945910149055313305105352743443
ln(8) = 2.079441541679835928251696364375
ln(9) = 2.197224577336219382790490473845
ln(10) = 2.302585092994045684017991454684 .
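The scheme (3.15a) and (3.18) can be sketched with Python's `decimal` module. The working precision of 40 digits and the helper name `gregory_log` are assumptions made for this illustration, not part of the text.

```python
from decimal import Decimal, getcontext

getcontext().prec = 40  # assumed working precision: 30 digits plus guard digits

def gregory_log(n):
    """ln((1 + 1/n) / (1 - 1/n)) summed by Gregory's series (3.15) with x = 1/n."""
    x = Decimal(1) / Decimal(n)
    x2 = x * x
    power, k, total = x, 1, Decimal(0)
    while power > Decimal(10) ** -45:   # stop once the terms are negligible
        total += power / k
        power *= x2
        k += 2
    return 2 * total

ln2 = gregory_log(3)                                # (3.15a): (1+1/3)/(1-1/3) = 2
ln3 = Decimal(3) / 2 * ln2 + gregory_log(17) / 2    # 3 = sqrt(9/8) * sqrt(8)
ln5 = (3 * ln2 + ln3) / 2 + gregory_log(49) / 2     # (3.18), ln 24 = 3 ln 2 + ln 3
ln7 = (4 * ln2 + ln3) / 2 + gregory_log(97) / 2     # ln 48 = 4 ln 2 + ln 3

if __name__ == "__main__":
    for p, value in ((2, ln2), (3, ln3), (5, ln5), (7, ln7)):
        print(f"ln({p}) = {value:.30f}")
```

The series for 1/49 and 1/97 need far fewer terms than the one for 1/3, which illustrates the remark that the convergence improves as p grows.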


The improvement of this calculation (compared to that of Briggs), achieved in 
only a few decades (from 1620 to 1670), is obviously spectacular. It demonstrates 
once again the enormous progress made in mathematics after the appearance of 
Descartes’ Geometry. 


Connection with Euler’s Number. The connection between the natural logarithm
and e is established in the following theorem.

(3.3) Theorem. The natural logarithm ln x is the logarithm to base e.

Proof. We apply the natural logarithm to the formula of Theorem 2.3. This gives,
using (3.5) and (3.13),

    $\ln\Bigl(1+\frac{x}{N}\Bigr)^N = N \cdot \ln\Bigl(1+\frac{x}{N}\Bigr) = N \cdot \Bigl(\frac{x}{N} - \frac{x^2}{2N^2} + \ldots\Bigr) \to x,$

so that $\ln e^x = x$.

We thus obtain a geometric interpretation of e: it is the number for which the
area under the hyperbola y = 1/x between 1 and e is equal to 1 (see Fig. 3.8).

FIGURE 3.8. Geometric meaning of e

FIGURE 3.9a. The functions $y = x^a$     FIGURE 3.9b. The functions $y = a^x$


Arbitrary Powers. Logarithms allow us to compute (and define) arbitrary powers
as follows (Joh. Bernoulli 1697, Principia Calculi Exponentialium, Opera, vol. 1,
p. 179): we use $a = e^{\ln a}$ and get

(3.19)    $a^b = (e^{\ln a})^b = e^{b \ln a}.$

Graphs of these functions, considered either as a function of a or as a function of
b, are sketched in Figs. 3.9a and 3.9b.
Exercises 


3.1 (Newton 1671, Method of Fluxions, Euler 1748, Introductio, §123). Show
that 2 = (4/3) · (3/2) yields

    $\ln 2 = \ln\frac{1+\frac{1}{7}}{1-\frac{1}{7}} + \ln\frac{1+\frac{1}{5}}{1-\frac{1}{5}}, \qquad \ln 3 = \ln\frac{1+\frac{1}{5}}{1-\frac{1}{5}} + \ln 2,$

which allows the simultaneous calculation of ln 2 and ln 3 by two rapidly
convergent series (3.15).


3.2 (Newton 1669, “Inventio Basis ex Area data”). Suppose that the area z under
the hyperbola is given by the formula

    $z = x - \frac{1}{2}x^2 + \frac{1}{3}x^3 - \frac{1}{4}x^4 + \frac{1}{5}x^5 - \ldots .$

Find a series for $x = e^z - 1$ of the form

    $x = z + a_2 z^2 + a_3 z^3 + a_4 z^4 + \ldots$

and (re)discover the series for the exponential function.

I.4 Trigonometric Functions


Sybil: It goes back to the dawn of civilization. 
(J. Cleese & C. Booth 1979, Fawlty Towers, The Psychiatrists) 


Measuring Angles. One of the oldest interests in geometry is the measurement 
of angles, mainly for astronomical purposes. The Babylonians divided the circle 
into 360°, probably because this was the approximate number of days in the year. 
Half the circle would then be 180°, the right angle 90°, and the equilateral triangle 
has angles of 60° (see Fig. 4.1a). Ptolemy¹, in his Almagest, A.D. 150, refined the
measurements by including the next digits in the number system in base 60, then in 
vogue, partes minutae primae (first small subdivisions) and partes minutae secon- 
dae (second small subdivisions). These became our “minutes” and “seconds”. But 
360° is not the only possibility. Many other units can be used; e.g., in some tech- 
nical applications we have grades, where the right angle has 100 grades. However, 
as for logarithms, there is a natural measure, based on the arc length of a circle 
of radius 1, the radian (see Fig. 4.1b). Here, the arc length of half of the circle is, 
with the precision computed by Th. F. de Lagny in 1719 and reproduced by Euler 
(with an error in the 113th decimal place, which is corrected here), 


3.14159265358979323846264338327950288419716939937510
58209749445923078164062862089986280348253421170679
821480865132823066470938446... .

For this somewhat unwieldy expression W. Jones (1706, p. 243) introduced the
abbreviation π (“periphery”). Then the angle of 54° drawn in Fig. 4.1 measures
54π/180 = 0.9425 radians.

FIGURE 4.1a. Babylonian degrees     FIGURE 4.1b. Angle measured by arc length


Definition of Trigonometric Functions. How can one measure an angle with a
rigid ruler? Well, we can only measure the chord (see Fig. 4.2), and then, with the
help of tables, try to find the angle, or vice versa. Such tables have their origin
in Greek antiquity (Hipparchus 150 B.C. (lost) and Ptolemy A.D. 150). The sine
function, which is connected to the chord function by sin α = (1/2) chord(2α),
has its origin in Indian (Brahmagupta around 630) and medieval European science
(Regiomontanus 1464). This function, originally named sinus rectus (i.e., vertical
sine), is much better adapted to the computation of triangles than the chord function.

¹ Πτολεμαῖος, Ptolemeus, Ptolemius, Ptolemée, Tolomeo, Птолемей, ....

FIGURE 4.2. The Chord Function of Ptolemy     FIGURE 4.3. Definition of sin, cos, tan, and cot


(4.1) Definition. Consider a right-angled triangle disposed in a circle of radius
1 as shown in Fig. 4.3. Then, the length of the leg opposite angle α is denoted by
sin α, that of the adjacent leg by cos α. Their quotients, which are the lengths of
the vertical and horizontal tangents to the circle, are

    $\tan\alpha = \frac{\sin\alpha}{\cos\alpha} \qquad\text{and}\qquad \cot\alpha = \frac{\cos\alpha}{\sin\alpha}.$

These definitions apply immediately to an arbitrary right-angled triangle with
hypotenuse c and other sides a, b (with a opposite angle α):

(4.1)    a = c · sin α,    b = c · cos α,    a = b · tan α.


While in geometry angles are traditionally denoted by lowercase Greek let- 
ters, as soon as we pass to radians and to the consideration of functions of a real 
variable (see the plots in Fig. 4.4), we prefer lowercase Latin letters (e.g., x) for 
the argument. Many formulas can be deduced from these figures, such as 


          sin 0 = 0,  cos 0 = 1,  sin π/2 = 1,  cos π/2 = 0,  sin π = 0,  cos π = −1,
(4.2a)    sin(−x) = −sin x,  cos(−x) = cos x
(4.2b)    sin(x + π) = −sin x,  cos(x + π) = −cos x
(4.2c)    sin(x + π/2) = cos x,  cos(x + π/2) = −sin x
(4.2d)    sin² x + cos² x = 1.

The functions sin x and cos x are periodic with period 2π, tan x is periodic with
period π.

FIGURE 4.4. The trigonometric functions sin x, cos x, and tan x


Fig. 4.5 reproduces a drawing of the sine curve on page 17 of A. Dürer’s
Underweysung der Messung (1525). Dürer calls this curve “eynn schraufen lini”
and claims it is useful for stonemasons who construct circular staircases.

FIGURE 4.5. A sine curve in Dürer (1525)²

Curious geometrical patterns arise when sin n is plotted for integer values of
n only (Fig. 4.6, see Strang 1991, Richert 1992).

FIGURE 4.6. Values of sin 1, sin 2, sin 3, ... with n in logarithmic scale

² Reproduced with permission of Dr. Alfons Uhl Verlag, Nördlingen.

Basic Relations and Consequences


These equations have a venerable age. Already Ptolemy deduces ... 
(L. Vietoris, J. reine ang. Math. vol. 186 (1949), p. 1) 


Let α and β be two angles with arcs x and y, respectively.

(4.2) Theorem (Ptolemy A.D. 150, Regiomontanus 1464).

(4.3)    sin(x + y) = sin x cos y + cos x sin y

(4.4)    cos(x + y) = cos x cos y − sin x sin y.

Proof. These relations can be seen directly for 0 ≤ x, y ≤ π/2 by inspecting the
three right-angled triangles in Fig. 4.7. All other configurations can be reduced to
this interval with the use of formulas (4.2b) and (4.2c).

FIGURE 4.7. Proof of formulas (4.3) and (4.4)

By dividing the two equations of Theorem 4.2, we obtain

(4.5)    $\tan(x+y) = \frac{\sin x\cos y + \cos x\sin y}{\cos x\cos y - \sin x\sin y} = \frac{\tan x + \tan y}{1 - \tan x\tan y}.$


Further Formulas. Replacing y by −y in (4.3) and (4.4) yields

(4.3’)    sin(x − y) = sin x cos y − cos x sin y

(4.4’)    cos(x − y) = cos x cos y + sin x sin y.

If we add relations (4.3) and (4.3’) we obtain sin(x + y) + sin(x − y) = 2 · sin x cos y.
Introducing new variables for x + y and x − y, namely

    x + y = u,  x − y = v    or equivalently    x = (u + v)/2,  y = (u − v)/2,

we obtain the first of the following three formulas:

(4.6)    $\sin u + \sin v = 2\cdot\sin\Bigl(\frac{u+v}{2}\Bigr)\cdot\cos\Bigl(\frac{u-v}{2}\Bigr)$

(4.7)    $\cos u + \cos v = 2\cdot\cos\Bigl(\frac{u+v}{2}\Bigr)\cdot\cos\Bigl(\frac{u-v}{2}\Bigr)$

(4.8)    $\cos v - \cos u = 2\cdot\sin\Bigl(\frac{u+v}{2}\Bigr)\cdot\sin\Bigl(\frac{u-v}{2}\Bigr).$

The others are obtained similarly.

Putting x = y in (4.3) and (4.4) gives

(4.9)     sin(2x) = 2 sin x cos x

(4.10)    cos(2x) = cos² x − sin² x = 1 − 2 sin² x = 2 cos² x − 1.

If we replace x by x/2 in (4.10) we obtain

(4.11)    $\sin\Bigl(\frac{x}{2}\Bigr) = \pm\sqrt{\frac{1-\cos x}{2}}, \qquad \cos\Bigl(\frac{x}{2}\Bigr) = \pm\sqrt{\frac{1+\cos x}{2}}.$


Some Values for sin and cos. The proportions of the equilateral triangle and of the
regular square give sin and cos for the angles of 30°, 60°, and of 45°. For the regular
pentagon see the figure (Hippasus 450 B.C.): the triangles ACE and AEF being
similar, we have 1 + 1/x = x, which implies that x = (1 + √5)/2, i.e., the point F
divides the diagonal CA in the golden section (see Euclid, 13th Element, §8); thus
we find that sin 18° = 1/(2x). A list of the values obtained is given in Table 4.1.
For a complete list of sin α for α = 3°, 6°, 9°, 12°, ... see Lambert (1770c).


TABLE 4.1. Particular values for sin, cos, and tan

  α      sin α             cos α             tan α
  0°     0                 1                 0
  15°    (√6 − √2)/4       (√6 + √2)/4       2 − √3
  18°    (√5 − 1)/4        √(10 + 2√5)/4     (3√5 − 5)√(5 + √5)/(10√2)
  30°    1/2               √3/2              √3/3
  36°    √(10 − 2√5)/4     (√5 + 1)/4        √(5 − √5)(√5 − 1)/(2√2)
  45°    √2/2              √2/2              1
  60°    √3/2              1/2               √3
  75°    (√6 + √2)/4       (√6 − √2)/4       2 + √3
  90°    1                 0                 ∞

De Moivre’s Formulas. By replacing y by nx in (4.3) and (4.4) we get the recurrence
relations

(4.12)    sin(n + 1)x = sin x cos nx + cos x sin nx,

(4.13)    cos(n + 1)x = cos x cos nx − sin x sin nx.

Starting from (4.9) and (4.10) and applying (4.12) and (4.13) repeatedly, we find

    cos(3x) = cos³ x − 3 sin² x cos x
    sin(3x) = 3 sin x cos² x − sin³ x
    cos(4x) = cos⁴ x − 6 sin² x cos² x + sin⁴ x
    sin(4x) = 4 sin x cos³ x − 4 sin³ x cos x
    cos(5x) = cos⁵ x − 10 sin² x cos³ x + 5 sin⁴ x cos x
    sin(5x) = 5 sin x cos⁴ x − 10 sin³ x cos² x + sin⁵ x.

Here we discover the appearance of Pascal’s triangle; the computation is precisely
the same as in Sect. I.2 (Theorem 2.1). Thus, we are able to state the following
general formulas (found by de Moivre 1730, see Euler 1748, Introductio §133):

(4.14)
    $\cos nx = \cos^n x - \frac{n(n-1)}{1\cdot 2}\sin^2 x\cos^{n-2} x + \frac{n(n-1)(n-2)(n-3)}{1\cdot 2\cdot 3\cdot 4}\sin^4 x\cos^{n-4} x - \ldots$

    $\sin nx = n\sin x\cos^{n-1} x - \frac{n(n-1)(n-2)}{1\cdot 2\cdot 3}\sin^3 x\cos^{n-3} x + \frac{n(n-1)(n-2)(n-3)(n-4)}{1\cdot 2\cdot 3\cdot 4\cdot 5}\sin^5 x\cos^{n-5} x - \ldots .$
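De Moivre's formulas (4.14) are easy to check numerically. In the sketch below the binomial coefficients n(n−1).../(1·2...) are taken from `math.comb`; the function names are illustrative only.

```python
import math

def cos_multiple(n, x):
    """cos(nx) from (4.14): even powers of sin x, alternating signs."""
    s, c = math.sin(x), math.cos(x)
    return sum((-1) ** (k // 2) * math.comb(n, k) * s ** k * c ** (n - k)
               for k in range(0, n + 1, 2))

def sin_multiple(n, x):
    """sin(nx) from (4.14): odd powers of sin x, alternating signs."""
    s, c = math.sin(x), math.cos(x)
    return sum((-1) ** (k // 2) * math.comb(n, k) * s ** k * c ** (n - k)
               for k in range(1, n + 1, 2))

if __name__ == "__main__":
    x = 0.7
    for n in (3, 4, 5, 12):
        print(n, cos_multiple(n, x) - math.cos(n * x),
              sin_multiple(n, x) - math.sin(n * x))
```

The printed differences are at the level of rounding errors, for n = 3, 4, 5 as well as for larger n.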


Series Expansions 


Sit arcus z infinite parvus; erit sin z = z et cos z = 1; ...
(Euler 1748, Introductio, §134)

While all the above formulas (4.5) through (4.14) have been derived only with the
use of (4.3) and (4.4) together with (4.2a), we now need a new basic hypothesis:
when x tends to zero, the “sinus rectus” merges with the arc. Since we are measuring
the angle in radians, it follows that the closer x is to zero, the better sin x is
approximated by x. We write this as

(4.15)    sin x ≈ x    for x → 0.

We now apply the same idea as in the proof of Eqs. (2.18) and (2.19): in de
Moivre’s formulas (4.14), we set x = y/N, n = N, where y is a fixed value,
while N tends to infinity and x tends to zero. Then, because of (4.15), we replace
sin x by x and cos x by 1. Also, since N → ∞, all terms (1 − k/N) become
1. This then leads to the formulas, in which we again write x for the variable y
(Newton 1669, Leibniz 1691, Jac. Bernoulli 1702),

(4.16)    $\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \frac{x^8}{8!} - \ldots$

(4.17)    $\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \ldots .$


Newton’s derivation of these series is indicated in Exercise 4.1; the above proof is 
due to Jac. Bernoulli as well as Euler’s Introductio, §134. 


Remark. Some care is necessary when replacing cos(y/N) by 1 for large values
of N, because this expression is raised to the Nth power. For example, 1 + y/N
tends to 1 for N → ∞, but $(1 + y/N)^N$ does not (see Theorem 2.3). Rescue
comes from the fact that cos(y/N) tends to 1 faster than 1 + y/N. Indeed, we
have

    $\cos^N(y/N) = \bigl(1 - \sin^2(y/N)\bigr)^{N/2} \approx 1 - \frac{N}{2}\cdot\frac{y^2}{N^2} \to 1$

by (4.2d), Theorem 2.2, and (4.15).

The convergence of the series (4.16) and (4.17) is illustrated in Fig. 4.8. We 
apparently have convergence for all x (see Sect. III.7). It can be observed (the com- 
putations were intentionally done in single precision) that problems of numerical 
precision due to rounding errors arise beyond x = 15. 
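The loss of precision mentioned above can be imitated as follows. This is a sketch under an assumption: single precision is emulated by rounding every intermediate result through `struct`, which is not necessarily how the computations for Fig. 4.8 were carried out.

```python
import math
import struct

def to_single(value):
    """Round a Python float (double) to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]

def sin_series(x, terms=120, single=False):
    """Sum the first `terms` terms of series (4.17); if `single` is set,
    every intermediate result is rounded to single precision."""
    rnd = to_single if single else (lambda v: v)
    x = rnd(x)
    term, total = x, 0.0
    for k in range(terms):
        total = rnd(total + term)
        # next term of (4.17): multiply by -x^2 / ((2k+2)(2k+3))
        term = rnd(-term * x * x / ((2 * k + 2) * (2 * k + 3)))
    return total

if __name__ == "__main__":
    for x in (1.0, 5.0, 15.0, 25.0):
        print(x, math.sin(x), sin_series(x), sin_series(x, single=True))
```

For large x the individual terms become enormous before they decay, so their rounding errors swamp the small final value; this happens much earlier in single than in double precision.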


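FIGURE 4.8. Series $\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \ldots$ and $\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \ldots$

The Series for tan x. We put

    $y = \tan x = \frac{\sin x}{\cos x} = a_1 x + a_3 x^3 + a_5 x^5 + a_7 x^7 + \ldots .$

To find $a_1, a_3, a_5, \ldots$ we multiply this formula by cos x and use the known series
(4.16) and (4.17)

    $x - \frac{x^3}{6} + \frac{x^5}{120} - \ldots = \bigl(a_1 x + a_3 x^3 + a_5 x^5 + \ldots\bigr)\Bigl(1 - \frac{x^2}{2} + \frac{x^4}{24} - \ldots\Bigr).$

Comparing the coefficients of $x$, $x^3$, and $x^5$ we get

    $1 = a_1, \qquad -\frac{1}{6} = a_3 - \frac{a_1}{2}, \qquad \frac{1}{120} = a_5 - \frac{a_3}{2} + \frac{a_1}{24},$

which yield

    $a_1 = 1, \qquad a_3 = \frac{1}{2} - \frac{1}{6} = \frac{1}{3}, \qquad a_5 = \frac{1}{120} - \frac{1}{24} + \frac{1}{6} = \frac{2}{15}.$

If we continue, we find the series

(4.18)    $\tan x = x + \frac{x^3}{3} + \frac{2x^5}{15} + \frac{17x^7}{315} + \frac{62x^9}{2835} + \frac{1382x^{11}}{155925} + \frac{21844x^{13}}{6081075} + \ldots .$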


No general rule is visible. However, there is one, based on the Bernoulli numbers 
(1.29) (see Exercise 10.2 of Sect. II.10). 
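The comparison of coefficients used for (4.18) can be continued mechanically. The following sketch uses exact rational arithmetic; the function name `tan_coefficients` is an illustrative choice.

```python
from fractions import Fraction
from math import factorial

def tan_coefficients(count):
    """Coefficients a_1, a_3, a_5, ... of tan x, obtained from
    sin x = (a_1 x + a_3 x^3 + ...) * cos x by comparing coefficients."""
    a = []
    for m in range(count):
        # coefficient of x^(2m+1) in the sine series (4.17)
        value = Fraction((-1) ** m, factorial(2 * m + 1))
        # subtract the contributions of the already known a's times the
        # cosine coefficients (4.16) of x^2, x^4, ...
        for j in range(1, m + 1):
            value -= Fraction((-1) ** j, factorial(2 * j)) * a[m - j]
        a.append(value)
    return a

if __name__ == "__main__":
    for m, coefficient in enumerate(tan_coefficients(7)):
        print(f"a_{2 * m + 1} = {coefficient}")
```

The first seven values reproduce the coefficients displayed in (4.18).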


Ancient Computations of Tables. From the values of Table 4.1, which are known
since antiquity, we can find with the help of (4.3’) and (4.4’) the values of sin 3°,
cos 3°, or, as then usual, chord 6°. The half-angle formulas (4.11) then allow the
computation of chord 3°, chord 1½°, chord ¾°, but not chord 1°. Ptolemy observed
that chord ¾° is approximately half of chord 1½°. Therefore one might guess that
chord 1° ≈ (2/3) · chord 1½°, which gives, in base 60 (see Aaboe 1964, p. 121),

(4.19)    chord 1° = 0; 1, 2, 50    (correct value 0; 1, 2, 49, 51, 48, 0, 25, 27, 22, ...).

Then, the values of sin and cos for all the angles 2°, 3°, 4°, etc. are obtained with
the help of (4.14). Around 1464, Regiomontanus computed a table (“SEQVITVR
NVNC EIVSDEM IOANNIS Regiomontani tabula sinuum, per singula minuta
extensa ...”) giving the sine of all angles at intervals of 1 minute, with five decimals.
See in Fig. 4.9 a table of tan α written in his hand (usually with four correct
decimals).

FIGURE 4.9. Autographic table of tan α by Regiomontanus (see Kaunzner 1980)³


A very precise computation of sin 1° was made by Al-Kashi (Samarkand in
1429) by solving numerically the equation (see Eq. (1.9))

(4.20)    $-4x^3 + 3x = \sin 3^\circ$

with the help of an iterative method and giving the solution in base 60 (“We extracted
it by inspired strength from the Eternal Presence ...”, see A. Aaboe 1954)

    sin 1° = 0; 1, 2, 49, 43, 11, 14, 44, 16, 19, 16, ... .

Here is the true value in base 60 calculated by a modern computer,

    sin 1° = 0; 1, 2, 49, 43, 11, 14, 44, 16, 26, 18, 28, 49, 20, 26, 50, 41, ... .

³ Reproduced with permission of Nürnberger Stadtbibliothek, Cent V, 63, f. 30".

Once again, we see the enormous progress of the series method (4.17), which
gives sin 1° = sin(π/180) = sin(0.0174532925...) with only three terms as

    sin 1° ≈ 0.0174532925199 − 0.0000008860962 + 0.000000000013496
           = 0.0174524064373 .
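Equation (4.20) lends itself to a simple fixed-point iteration. The sketch below is a modern illustration and not a reconstruction of Al-Kashi's own procedure: sin 3° is assembled from the values of Table 4.1 with (4.3’), the equation is rewritten as x = (sin 3° + 4x³)/3, and the working precision of 50 digits is an arbitrary assumption.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50  # assumed working precision

def sin_3_degrees():
    """sin 3 deg = sin(18 deg - 15 deg), from Table 4.1 and formula (4.3')."""
    r2, r5, r6 = Decimal(2).sqrt(), Decimal(5).sqrt(), Decimal(6).sqrt()
    sin18 = (r5 - 1) / 4
    cos18 = (10 + 2 * r5).sqrt() / 4
    sin15 = (r6 - r2) / 4
    cos15 = (r6 + r2) / 4
    return sin18 * cos15 - cos18 * sin15

def sin_1_degree(iterations=40):
    """Solve -4x^3 + 3x = sin 3 deg by iterating x -> (sin 3 deg + 4x^3)/3."""
    s3 = sin_3_degrees()
    x = s3 / 3                      # starting value: neglect the cubic term
    for _ in range(iterations):
        x = (s3 + 4 * x ** 3) / 3
    return x

def sexagesimal(value, places):
    """First `places` base-60 digits of the fractional part of `value`."""
    digits = []
    fraction = value - int(value)
    for _ in range(places):
        fraction *= 60
        digit = int(fraction)
        digits.append(digit)
        fraction -= digit
    return digits

if __name__ == "__main__":
    print(sexagesimal(sin_1_degree(), 16))
```

Since the derivative 4x² of the iteration function is about 0.001 near the solution, every step gains roughly three decimal digits.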


Inverse Trigonometric Functions 


Trigonometric functions define sin x, cos x, tan x, for a given arc x. Inverse
trigonometric functions define the arc x as a function of sin x, cos x, or tan x.

(4.3) Definition. Consider a right-angled triangle with hypotenuse 1. If x denotes
the length of the leg opposite the angle, arcsin x is the length of the
arc (see Fig. 4.10a). The values arccos x and arctan x are defined analogously
(Figs. 4.10b and 4.10c).

FIGURE 4.10. Definition of arcsin x, arccos x, and arctan x

Because of the periodicity of the trigonometric functions, the inverse trigonometric
functions are multivalued. The so-called principal branches satisfy the following
inequalities:

    y = arcsin x  ⇔  x = sin y    for −1 ≤ x ≤ 1,  −π/2 ≤ y ≤ π/2,
    y = arccos x  ⇔  x = cos y    for −1 ≤ x ≤ 1,  0 ≤ y ≤ π,
    y = arctan x  ⇔  x = tan y    for −∞ < x < ∞,  −π/2 < y < π/2.


Series for arctan x. 


If one really exposes something, it is better to give no proof, or such a proof
which doesn’t let them discover our tricks (Es ist aber guth, dass wann man
etwas würklich exhibiret, man entweder keine demonstration gebe, oder eine
solche, dadurch sie uns nicht hinter die schliche kommen.)

(Letter of Leibniz; quoted from Euler’s Opera Omnia, vol. 27, p. xxvii)


The series for arctan x was discovered by Gregory in 1671. In 1674, Leibniz
rediscovered it and published the formula in 1682 in the Acta Eruditorum, enthusing
about the kindness of the Lord but without disclosing the path that led him to the
result (see citation). We therefore seek inspiration in Newton’s treatment of the
series for arcsin x in the manuscript De Analysi, written 1669, but published only
40 years later (see formula (4.25) below). One can either compute the arc length
or the area of the corresponding circular sector. The relation between the two is
known since Archimedes (“Proposition 1” of On the measurement of the circle),
and is also displayed by Kepler in Fig. 4.12.

FIGURE 4.11. The derivation of the series for y = arctan x

FIGURE 4.12. The area of the circle seen by Kepler 1615⁴


Let x, a given value, be the tangent of an angle whose arc y = arctan x we want
to determine (see Fig. 4.11a). Because of Pythagoras’ Theorem, we have

(4.21)    $OA = \sqrt{1+x^2}.$

By Thales’ Theorem, applied to the two larger similar triangles shaded in grey, we
have

(4.22)    $\frac{\Delta u}{\Delta x} = \frac{1}{\sqrt{1+x^2}} \qquad\text{or}\qquad \Delta u = \frac{\Delta x}{\sqrt{1+x^2}}.$

By orthogonal angles, the small grey triangle is also similar to the two other ones,
and we have consequently

(4.23)    $\Delta y = \frac{\Delta u}{\sqrt{1+x^2}} \overset{(4.22)}{=} \frac{\Delta x}{1+x^2}.$

This means that the infinitesimal arc length Δy is equal to the shaded area in
Fig. 4.11b. The wanted arc y is therefore equal to the total area between 0 and x
below

    $\frac{1}{1+x^2} \overset{(2.12)}{=} 1 - x^2 + x^4 - x^6 + x^8 - \ldots ,$

i.e., by Theorem 3.2 (Fermat),

(4.24)    $y = \arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \frac{x^9}{9} - \ldots ,$

which is valid for |x| ≤ 1.

⁴ Reproduced with permission of Bibl. Publ. Univ. Genève.


Series for arcsin x. 
A friend that hath a very excellent genius to those things, brought me the
other day some papers, wherein he hath sett downe methods of calculating the
dimensions of magnitudes like that of Mʳ Mercator concerning the hyperbola,
but very generall. .. His name is Mʳ Newton; a fellow of our College, & very
young ... but of an extraordinary genius & proficiency in these things.
(Letter of Barrow to Collins 1669, quoted from Westfall 1980, p. 202)

After the publication of Mercator’s book towards the end of 1668, in which the
series for ln(1 + x) was published, Newton hastened to show his manuscript De
Analysi (Newton 1669) to some of his friends, but did not allow its publication.
It was finally inserted as the first chapter of Analysis per quantitatum (Newton
1711) published by W. Jones. Newton had not only found Mercator’s series much
earlier, but was the first to discover the series

(4.25)    $\arcsin x = x + \frac{1}{2}\,\frac{x^3}{3} + \frac{1\cdot 3}{2\cdot 4}\,\frac{x^5}{5} + \frac{1\cdot 3\cdot 5}{2\cdot 4\cdot 6}\,\frac{x^7}{7} + \ldots$

and also the series for sin x and cos x (see Exercise 4.1). Newton’s proof for (4.25)
was as follows.


Proof. We suppose x given and want to compute the arc y for which x = sin y
(see Fig. 4.13). If x increases by Δx, then y increases by Δy, which is

(4.26)    $\Delta y \approx \frac{\Delta x}{\sqrt{1-x^2}}$

because the two shaded triangles in Fig. 4.13 are similar. This quantity is the
area of a rectangle of width Δx and height $1/\sqrt{1-x^2}$. Therefore, similarly as in
Fig. 4.11c, the total arc length y is equal to the area below the function $1/\sqrt{1-x^2}$
between 0 and x. Expanding this function by the Binomial Theorem 2.2 gives with
a = −1/2

(4.27)    $\frac{1}{\sqrt{1-x^2}} = 1 + \frac{1}{2}x^2 + \frac{1\cdot 3}{2\cdot 4}x^4 + \frac{1\cdot 3\cdot 5}{2\cdot 4\cdot 6}x^6 + \ldots$

and we obtain formula (4.25), once again, by replacing the functions $1, x^2, x^4, \ldots$
by their areas (Theorem 3.2) $x, x^3/3, x^5/5, \ldots$.

FIGURE 4.13. Proof of (4.25) for y = arcsin x; illustration from Newton (1669)⁵


Computation of Pi 


... you will not deny that you have discovered a very remarkable property 
of the circle, which will forever be famous among geometers. 
(Letter of Huygens to Leibniz, November 7, 1674) 


Theref. the Diameter is to the Periphery, as 1,000,&c. to 3.141592653.5897932384
.6264338327.9502884197.1693993751.0582097494.4592307816.4062862089
.9862803482.5342117067.9+, True to above a hundred Places;
as Computed by the Accurate and Ready Pen of the Truly Ingenious Mr. 
John Machin: Purely as an Instance of the Vast advantage Arithmetical 
Calculations receive from the Modern Analysis, in a Subject that has bin 
of so Engaging a Nature, as to have employ’d the Minds of the most Em- 
inent Mathematicians, in all Ages, to the Consideration of it. ... But the 
Method of Series (as improv’d by Mr. Newton, and Mr. Halley) performs 
this with great Facility, when compared with the Intricate and Prolix Ways 
of Archimedes, Vieta, Van Ceulen, Metius, Snellius, Lansbergius, &c. 
(W. Jones 1706) 
Archimedes (283-212 B.C.) obtained, by calculating the perimeters of the regular
polygons of n = 6, 12, 24, 48, 96 sides and by repeated use of formulas (4.11),
the estimate

(4.28)    $3\tfrac{10}{71} < \pi < 3\tfrac{1}{7}.$

All attempts made in the Middle Ages to improve on this value were fruitless. Finally,
by applying Archimedes’ method, Adrien van Roomen (in 1580) succeeded
in obtaining 20 decimals after years of calculation. Ludolph van Ceulen (= Köln)
(in 1596, 1616) computed 35 decimals, which for a long time decorated Ludolph’s
tombstone in St. Peter’s Cathedral in Leiden (Holland). In order to reach this precision,
Ludolph had to continue the calculations up to $n = 6 \cdot 2^{60}$.


Leibniz’s Series. From Table 4.1 we know that tan(π/4) = 1 and consequently
arctan(1) = π/4. Putting x = 1 in (4.24), we find the famous series of Leibniz
(1682)

(4.29)    $\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \frac{1}{11} + \frac{1}{13} - \ldots .$

Although we agree with Leibniz about the undeniable beauty of his formula (“The
Lord loves odd numbers”, see Fig. 4.14), we also see that it is totally inefficient for
practical computations, since for 50 decimals we would have to add $10^{50}$ terms
with “labor fere in aeternum” (Euler 1737).

FIGURE 4.14. Leibniz’s illustration for series (4.29)⁶

⁵ The right-hand picture of Fig. 4.13 is printed with permission of Bibl. Univ. Genève.


Much more efficient is the use of tan(π/6) = 1/√3 (see Table 4.1), which
leads to the formula

(4.30)    $\pi = 2\sqrt{3}\,\Bigl(1 - \frac{1}{3\cdot 3} + \frac{1}{5\cdot 3^2} - \frac{1}{7\cdot 3^3} + \frac{1}{9\cdot 3^4} - \ldots\Bigr),$

with which, by adding 210 terms “exhibitus incredibili labore”, Th. F. de Lagny
computed in 1719 the value displayed at the beginning of this section. The series
(4.25) for arcsin x can also be used; for example, because of sin(π/6) = 1/2, we
have

(4.31)    $\frac{\pi}{6} = \frac{1}{2} + \frac{1}{2}\cdot\frac{1}{3\cdot 2^3} + \frac{1\cdot 3}{2\cdot 4}\cdot\frac{1}{5\cdot 2^5} + \frac{1\cdot 3\cdot 5}{2\cdot 4\cdot 6}\cdot\frac{1}{7\cdot 2^7} + \ldots .$
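The difference in efficiency between (4.29), (4.30), and (4.31) shows up already in double precision. The following is a sketch; the function names are illustrative and not taken from the text.

```python
import math

def pi_leibniz(n):
    """4 times the first n terms of Leibniz's series (4.29)."""
    return 4 * sum((-1) ** k / (2 * k + 1) for k in range(n))

def pi_lagny(n):
    """First n terms of (4.30), the arctan series at x = 1/sqrt(3)."""
    return 2 * math.sqrt(3) * sum((-1) ** k / ((2 * k + 1) * 3 ** k)
                                  for k in range(n))

def pi_arcsin(n):
    """6 times the first n terms of (4.31), the arcsin series at x = 1/2."""
    total, coefficient = 0.0, 1.0   # coefficient = 1*3*...*(2k-1) / (2*4*...*2k)
    for k in range(n):
        total += coefficient / ((2 * k + 1) * 2 ** (2 * k + 1))
        coefficient *= (2 * k + 1) / (2 * k + 2)
    return 6 * total

if __name__ == "__main__":
    for n in (5, 10, 20, 30):
        print(n, pi_leibniz(n) - math.pi, pi_lagny(n) - math.pi,
              pi_arcsin(n) - math.pi)
```

With Leibniz's series the error decreases only like 1/n, whereas each term of the other two series gains a constant factor (3 and 4, respectively).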


Composite Formulas. We insert u = tan x and v = tan y into (4.5) and obtain

(4.32)    $\arctan u + \arctan v = \arctan\Bigl(\frac{u+v}{1-uv}\Bigr)$

if |arctan u + arctan v| < π/2. If we set u = 1/2 and v = 1/3, we see that the
fraction to the right of (4.32) is equal to 1. This gives Euler’s formula (1737),

(4.33)    $\frac{\pi}{4} = \arctan\frac{1}{2} + \arctan\frac{1}{3},$

for which the series (4.24) already converges much better.
Especially attractive is the approach of John Machin, published (without details)
in W. Jones (1706, p. 243). Putting u = v = 1/5, we get

    $2\cdot\arctan\frac{1}{5} = \arctan\Bigl(\frac{2/5}{1 - 1/25}\Bigr) = \arctan\frac{5}{12}.$

For u = v = 5/12 one obtains

    $2\cdot\arctan\frac{5}{12} = \arctan\Bigl(\frac{2\cdot 5/12}{1 - 25/144}\Bigr) = \arctan\frac{120}{119}.$

Finally, we put u = 120/119 and search for a v such that

    $\frac{u+v}{1-uv} = 1, \qquad\text{hence}\qquad v = \frac{1-u}{1+u} = -\frac{1}{239}.$

All these formulas together give

(4.34)    $\frac{\pi}{4} = 4\cdot\arctan\frac{1}{5} - \arctan\frac{1}{239},$

an expression for which the series (4.24) is particularly attractive for calculations
in base 10 (see Table 4.2). “The Accurate and Ready Pen” of Machin found 100
decimals in this way.

⁶ Reproduced with permission of Bibl. Publ. Univ. Genève.


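TABLE 4.2. Computation of π by Machin’s formula

  n    terms of the series for arctan(1/5)
  1    +0.200000000000000000000000000
  3    −0.002666666666666666666666667
  5    +0.000064000000000000000000000
  7    −0.000001828571428571428571429
  9    +0.000000056888888888888888889
 11    −0.000000001861818181818181818
 13    +0.000000000063015384615384615
 15    −0.000000000002184533333333333
 17    +0.000000000000077101176470588
 19    −0.000000000000002759410526316
 21    +0.000000000000000099864380952
 23    −0.000000000000000003647220870
 25    +0.000000000000000000134217728
 27    −0.000000000000000000004971027
 29    +0.000000000000000000000185128
 31    −0.000000000000000000000006927
 33    +0.000000000000000000000000260
 35    −0.000000000000000000000000010
      = 0.197395559849880758370049763

  n    terms of the series for arctan(1/239)
  1    +0.004184100418410041841004184
  3    −0.000000024416591787083803627
  5    +0.000000000000256472314424647
  7    −0.000000000000000003207130658
  9    +0.000000000000000000000043669
 11    −0.000000000000000000000000001
      = 0.004184076002074723864538214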
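Machin's computation can be repeated with exact integer arithmetic. The sketch below scales everything by a power of 10 and carries ten guard digits; both choices are assumptions of this illustration, and the function names are hypothetical.

```python
def arctan_inverse(q, digits):
    """arctan(1/q) * 10**(digits + 10), summed with series (4.24) in integers."""
    scale = 10 ** (digits + 10)      # ten guard digits absorb truncation errors
    power = scale // q               # (1/q)^(2k+1), scaled
    total, k = 0, 0
    while power:
        term = power // (2 * k + 1)
        total += term if k % 2 == 0 else -term
        power //= q * q
        k += 1
    return total

def machin_pi(digits):
    """pi * 10**digits (truncated) from Machin's formula (4.34)."""
    quarter = 4 * arctan_inverse(5, digits) - arctan_inverse(239, digits)
    return 4 * quarter // 10 ** 10

if __name__ == "__main__":
    print(machin_pi(100))
```

The division by 25 for the powers of 1/5 is the reason why this formula is so convenient for hand calculations in base 10.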


The search for other formulas of this type becomes a problem of number theory.
Gauss, as a by-product of 20 pages of factorization tables, found (see Werke,
vol. 2, p. 477-502)

    $\frac{\pi}{4} = 12\arctan\frac{1}{18} + 8\arctan\frac{1}{57} - 5\arctan\frac{1}{239},$

    $\frac{\pi}{4} = 12\arctan\frac{1}{38} + 20\arctan\frac{1}{57} + 7\arctan\frac{1}{239} + 24\arctan\frac{1}{268}.$
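Formulas of this type can be verified exactly, without any series: arctan(1/q) is the angle of the point (q, 1), and angles add when points are multiplied by rule (4.32), which is the multiplication of the complex numbers q + i. The sketch below uses pairs of integers; the function names are illustrative. Note that the test only determines the total angle up to multiples of 2π, which suffices here because the sums are obviously small.

```python
def gauss_multiply(z, w):
    """Product of the integer pairs z = (a, b) and w = (c, d), read as a + ib, c + id."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def gauss_power(z, n):
    """n-th power of z by repeated multiplication (n >= 0)."""
    result = (1, 0)
    for _ in range(n):
        result = gauss_multiply(result, z)
    return result

def is_quarter_pi(terms):
    """terms: list of (m, q) standing for m * arctan(1/q).
    True if the product of the (q + i)^m lies on the ray of angle pi/4."""
    z = (1, 0)
    for m, q in terms:
        base = (q, 1) if m > 0 else (q, -1)
        z = gauss_multiply(z, gauss_power(base, abs(m)))
    return z[0] == z[1] and z[0] > 0

if __name__ == "__main__":
    print(is_quarter_pi([(1, 2), (1, 3)]))                         # Euler (4.33)
    print(is_quarter_pi([(4, 5), (-1, 239)]))                      # Machin (4.34)
    print(is_quarter_pi([(12, 18), (8, 57), (-5, 239)]))           # Gauss
    print(is_quarter_pi([(12, 38), (20, 57), (7, 239), (24, 268)]))
```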


Today, several million digits of π have been calculated. See Shanks & Wrench Jr.
(1962) for a list of the first 100 000 decimals (the 100 000th digit is a 6). More
details about old and recent history can be found in Miel (1983).

Exercises

4.1 (Newton 1669, “Inventio Basis ex data Longitudine Curvæ”). Having found
the series $z = x + \frac{1}{6}x^3 + \frac{3}{40}x^5 + \frac{5}{112}x^7 + \ldots$ for the arcsin (see (4.25)),
discover the series for x = sin z in the form $x = z + a_3 z^3 + a_5 z^5 + a_7 z^7 + \ldots$
(similar to Exercise 3.2) and that of w = cos z by expanding $w = \sqrt{1 - x^2}$
(see Fig. 4.15).


FIGURE 4.15. Extract from Newton (1669), p. 17


4.2 Understand Ptolemy’s original proof of the addition theorems (4.3) and (4.4) 
for the chord function (see Fig. 4.16). 


FIGURE 4.16. Ptolemy’s proof of formula for chord (α + β); from Almagest, transl. by
Regiomontanus, printed 1496⁷


Hint. Use (and/or prove) “Ptolemy’s Lemma”, which states that the sides and
diagonals of a quadrilateral inscribed in a circle satisfy ac + bd = δ₁δ₂. For
the proof of the lemma, draw a line DE such that angle EDA equals angle CDB.
So we have similar triangles

    EDA ∼ CDB  ⇒  b/δ₁ = u/d
    DCE ∼ DBA  ⇒  a/δ₁ = v/c

whence bd + ac = (u + v)δ₁ = δ₁δ₂.


⁷ Figs. 4.15 and 4.16 are reproduced with permission of Bibl. Publ. Univ. Genève.

4.3 The hyperbolic functions (Foncenex 1759, Lambert 1770b). For a given x
let P be the point on the hyperbola u² − v² = 1 such that the shaded area of
Fig. 4.17 (left) is equal to x/2. Then, the coordinates of this point are denoted
by (cosh x, sinh x).
a) Prove that

(4.35)    $\cosh x = \frac{e^x + e^{-x}}{2}, \qquad \sinh x = \frac{e^x - e^{-x}}{2}.$

Hint. The areas of the triangles ACB and PCQ are equal. Hence, the areas of
ACPA and ABQPA are also equal and are equal to (ln a)/2, if the distance
between C and Q is denoted by a/√2 (Fig. 4.17, right).
b) Verify the relations

(4.36)    sinh(x + y) = sinh x cosh y + cosh x sinh y
          cosh(x + y) = cosh x cosh y + sinh x sinh y.

c) The inverse functions of (4.35), the area functions, are defined by

    y = arsinh x  ⇔  x = sinh y    for −∞ < x < ∞,  −∞ < y < ∞,
    y = arcosh x  ⇔  x = cosh y    for 1 ≤ x < ∞,  0 ≤ y < ∞.

Prove that  $\operatorname{arsinh} x = \ln\bigl(x + \sqrt{x^2+1}\bigr)$,  $\operatorname{arcosh} x = \ln\bigl(x + \sqrt{x^2-1}\bigr)$.

FIGURE 4.17. Definition of hyperbolic functions


4.4 Verify (and use) Newton’s advice (Newton 1671, Probl. IX, §XLIX) for the
computation of π: by computing the area a under the circle $y = x^{1/2}(1-x)^{1/2}$
between x = 0 and x = 1/4 by binomial series expansion, show that

    $\pi = 24a + \frac{3\sqrt{3}}{4} = 24\Bigl(\frac{2}{3\cdot 2^3} - \frac{1}{2}\cdot\frac{2}{5\cdot 2^5} - \frac{1}{2\cdot 4}\cdot\frac{2}{7\cdot 2^7} - \frac{1\cdot 3}{2\cdot 4\cdot 6}\cdot\frac{2}{9\cdot 2^9} - \ldots\Bigr) + \frac{3\sqrt{3}}{4}.$

I.5 Complex Numbers and Functions


Neither the true nor the false roots are always real; sometimes they are 
imaginary; that is, while we can always imagine as many roots for each 
equation as I have assigned, yet there is not always a definite quantity cor- 
responding to each root we have imagined. (Descartes 1637) 
Cardano (1545, in his Ars Magna) was the first to encounter complex numbers by
asking the following question: divide a given line ab, say, of length 10 “in duas
partes”, so that the rectangle with these two parts as sides has area 40. Everybody
can see (see Fig. 5.1) that the area of such a rectangle is at most 25, so the problem
has no real solution. But algebra gives us a solution, since the corresponding
equation (see Eq. (1.3)) x² − 10x + 40 = 0 leads to (“ideo imaginaberis √−15”)

    5 + √−15    and    5 − √−15.

Although these formulas are perfectly useless and sophistic (“que uere est sophistica”),
they must contain an amount of truth, since their product

    (5 + √−15)(5 − √−15) = 25 − (−15) = 40

is actually what we want (see Fig. 5.1).

FIGURE 5.1. Excerpts from Cardano’s Ars Magna¹


During the following centuries, such “impossible” or “imaginary” (Descartes,
see quotation) solutions of algebraic equations came up again and again, gave rise
to many disputes, but proved to be more and more useful. Full maturity in their
handling was achieved in the work of Euler, who also introduced later in his life
the symbol $i$ for $\sqrt{-1}$. The above values are now written as $5 \pm i\sqrt{15}$, and complex
numbers are of the general form

$c = a + ib,$

where $a = \mathrm{Re}(c)$ is called the real part, and $b = \mathrm{Im}(c)$ the imaginary part of
$c$. The interpretation of a complex number $a + ib$ as the point $(a, b)$ in the
two-dimensional complex plane is due to Gauss’ thesis (1799) (see Fig. 5.2) and to
Argand in 1806 (see Kline 1972, p. 630).


¹ Reproduced with permission of Bibl. Publ. Univ. Genève.
FIGURE 5.2a. Complex plane and cubic roots
FIGURE 5.2b. Complex plane in Gauss (1799)²


Complex Operations. For computation with complex numbers we keep in mind
the relation $i^2 = -1$ and apply the usual rules for rational or real numbers.
Therefore, the sum (or the difference) of two complex numbers

$c = a + ib, \qquad w = u + iv$

is the complex number obtained by adding (or subtracting) the real and imaginary
parts. The product becomes (compare with Fig. 5.1)

(5.1)   $c \cdot w = au - bv + i(av + bu).$

To compute the quotient $w/c$ we observe that the product of $c$ with its complex
conjugate
(5.2)   $\bar{c} = a - ib$
is real and nonnegative, namely $c \cdot \bar{c} = a^2 + b^2$. Multiplying numerator and
denominator of $w/c$ by $\bar{c}$, the quotient $w/c$ becomes for $c \ne 0$

(5.3)   $\frac{w}{c} = \frac{w \cdot \bar{c}}{c \cdot \bar{c}} = \frac{au + bv}{a^2 + b^2} + i\,\frac{av - bu}{a^2 + b^2}.$
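The product and quotient rules can be checked against Python's built-in complex type. This is only a sketch; the helper names `product` and `quotient` are illustrative, with $c = a + ib$ and $w = u + iv$ as in the text.

```python
def product(a, b, u, v):
    """Real and imaginary part of (a + ib)(u + iv), following (5.1)."""
    return (a * u - b * v, a * v + b * u)

def quotient(a, b, u, v):
    """Real and imaginary part of (u + iv)/(a + ib), following (5.3)."""
    n = a * a + b * b  # c times its conjugate, real and nonnegative
    if n == 0:
        raise ZeroDivisionError("c must be nonzero")
    return ((a * u + b * v) / n, (a * v - b * u) / n)

c, w = 3 + 2j, 1 - 4j
print(product(3, 2, 1, -4), c * w)
print(quotient(3, 2, 1, -4), w / c)
```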


Euler’s Formula and Its Consequences 


... how imaginary exponentials are expressed in terms of the sine and co- 

sine of real arcs. (Euler 1748, Introductio, §138) 
This formula, discovered by Euler in 1740 by studying differential equations of 
the form y” + y = 0 (see Sect. II.8), is the key to understanding operations with 
complex numbers. 


² Reproduced with permission of Georg Olms Verlag.
We define $e^{ix}$ by the series of Theorem 2.3 (with $x$ replaced by $ix$), use the
relations $i^2 = -1$, $i^3 = -i$, $i^4 = 1$, $i^5 = i, \ldots$, and separate real and imaginary
parts:

$e^{ix} = 1 + ix + \frac{(ix)^2}{2!} + \frac{(ix)^3}{3!} + \frac{(ix)^4}{4!} + \frac{(ix)^5}{5!} + \ldots$

$\qquad = \left(1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \ldots\right) + i\left(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \ldots\right) = \cos x + i \sin x.$

The first bracket is the series of $\cos x$, the second that of $\sin x$.
The result is the famous formula (Euler 1743, Opera Omnia, vol. 14, p. 142)

(5.4)   $e^{ix} = \cos x + i \sin x.$

As a first application, we insert the particular values $x = \pi/2$ and $x = \pi$, which
give
$e^{i\pi/2} = i$   and   $e^{i\pi} = -1,$

elegant formulas combining the famous mathematical constants $\pi$, $e$, and $i$ in
wonderfully simple expressions.
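Euler's formula can be tested numerically by summing the exponential series with a purely imaginary argument. A minimal sketch; `exp_series` and its default of 30 terms are illustrative choices, not part of the text.

```python
import math

def exp_series(z: complex, terms: int = 30) -> complex:
    """Partial sum 1 + z + z^2/2! + ... of the exponential series."""
    total = 0j
    term = 1 + 0j
    for n in range(terms):
        total += term
        term *= z / (n + 1)  # turns z^n/n! into z^(n+1)/(n+1)!
    return total

x = 1.0
print(exp_series(1j * x), complex(math.cos(x), math.sin(x)))
print(exp_series(1j * math.pi / 2), exp_series(1j * math.pi))
```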


Polar Coordinates. Equation (5.4) shows that the point $e^{i\varphi}$ has real part $\cos\varphi$
and imaginary part $\sin\varphi$, i.e., it is the point on the unit circle at which the radius
forms an angle $\varphi$ with the real axis (see Figs. 5.2a and 4.3). Consequently, each
complex number can be written as

(5.5)   $c = a + ib = r \cdot e^{i\varphi},$

where

(5.6)   $r = \sqrt{a^2 + b^2} = \sqrt{c \cdot \bar{c}}$   and   $\varphi = \arctan\left(\frac{b}{a}\right).$

We call $r = |c|$ the absolute value of $c$ and $\varphi = \arg(c)$ its argument. Let

$c = r \cdot e^{i\varphi}$   and   $w = s \cdot e^{i\theta}$

be two complex numbers in polar coordinate representation. It follows from (4.2a)
that $\bar{c} = r \cdot e^{-i\varphi}$ and from Theorem 4.2 that

$e^{i\varphi} \cdot e^{i\theta} = (\cos\varphi + i\sin\varphi) \cdot (\cos\theta + i\sin\theta)$

(5.7)   $\qquad = (\cos\varphi\cos\theta - \sin\varphi\sin\theta) + i(\cos\varphi\sin\theta + \sin\varphi\cos\theta) = e^{i(\varphi + \theta)}.$

Therefore, we obtain for the product and quotient

(5.8)   $c \cdot w = rs \cdot e^{i(\varphi + \theta)}, \qquad \frac{w}{c} = \frac{s}{r} \cdot e^{i(\theta - \varphi)}.$

Here the polar coordinate form is especially illuminating: multiplication multiplies
the radii and adds the angles; division divides the radii and subtracts the angles.


Roots. We wish to know, say, $\sqrt[3]{c}$. Once again, polar coordinates perform the
miracle, since roots of products are the products of the roots. However, we must
be careful, because $e^{2i\pi} = 1$ and $e^{4i\pi} = 1$ have cube roots $e^{2i\pi/3}$ and $e^{4i\pi/3}$,
which are different from 1. Thus, there are three cube roots of $c$,

(5.9)   $\sqrt[3]{c} = \sqrt[3]{r} \cdot e^{i\varphi/3}, \quad \sqrt[3]{r} \cdot e^{i(\varphi/3 + 2\pi/3)}, \quad \sqrt[3]{r} \cdot e^{i(\varphi/3 + 4\pi/3)}.$

These, for $c = 3 + 2i$, are displayed in Fig. 5.2a. The next candidate, $e^{6i\pi} = 1$,
just reproduces the first of the roots and gives nothing new. The roots thus obtained
form a regular star; of Mercedes-type for $n = 3$, of Handel’s Fire-Musick-type for
$n > 3$. Fig. 5.3 represents the map $z \mapsto w = z^3$ for varying values of $z$ and its
inverse function $w \mapsto z = w^{1/3} = \sqrt[3]{w}$. The animal that thereby undergoes
painful deformations is known as “Arnold’s cat”. The inverse map produces three
cats out of one.
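The three cube roots of $c = 3 + 2i$ can be computed directly from the polar form. In this sketch the helper `nth_roots` is an illustrative name, and `cmath.polar` is used to obtain $r$ and $\varphi$ (it determines the angle in the correct quadrant, which a bare $\arctan(b/a)$ does not).

```python
import cmath
import math

def nth_roots(c: complex, n: int) -> list:
    """All n-th roots of c: modulus r^(1/n), angles (phi + 2 k pi)/n for k = 0..n-1."""
    r, phi = cmath.polar(c)
    return [cmath.rect(r ** (1.0 / n), (phi + 2 * math.pi * k) / n) for k in range(n)]

roots = nth_roots(3 + 2j, 3)
for z in roots:
    print(z, z ** 3)  # each cube reproduces 3 + 2i up to rounding
```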


FIGURE 5.3. The function $w = z^3$ and its inverse $z = w^{1/3}$


Exponential Function and Logarithm. The exponential function can be extended
to complex arguments as follows:

(5.10)   $e^c = e^a \cdot e^{ib} = e^a(\cos b + i \sin b)$   for   $c = a + ib.$

This definition retains the fundamental property $e^{c+w} = e^c \cdot e^w$, which is obtained
from Eq. (5.7).

The nature of the logarithms of negative numbers gave rise to long and heated
disputes between Leibniz and Joh. Bernoulli. Euler (1751) gave a marvelous survey
of these discussions, which were kept as secret as possible since such disputes
would have damaged the prestige of pure mathematics as an exact and rigorous
science. The true nature of logarithms of negative and complex numbers was then
revealed by Euler (“Denouement des difficultés precedentes”) with the help, once
again, of Eq. (5.4). Many of the contradictions of the earlier disputes were resolved
by the fact that the logarithm of a complex number does not represent one number,
but an infinity of values. We write $c$ in polar coordinate form

$c = r \cdot e^{i(\varphi + 2k\pi)}, \qquad k = 0, \pm 1, \pm 2, \ldots,$

which is a product. In order to retain properties (3.1) and (3.7) for the logarithm
with complex arguments, we define

(5.11)   $\ln(c) = \ln(r) + i(\varphi + 2k\pi), \qquad k = 0, \pm 1, \pm 2, \ldots.$

Fig. 5.4 represents the map $w = e^z$ and its inverse. Since the imaginary
part of the logarithm is simply $\varphi = \arg(c)$, it is clear that, after each rotation
$\varphi \mapsto \varphi + 2\pi$, the logarithms repeat again and again.
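The many-valuedness of the logarithm can be made concrete with a few lines of code. This sketch assumes a hypothetical helper `log_branch(c, k)` returning the value of (5.11) for a given integer $k$; every branch is mapped back to $c$ by the exponential function.

```python
import cmath
import math

def log_branch(c: complex, k: int) -> complex:
    """Value ln(r) + i(phi + 2 k pi) of the complex logarithm, for integer k."""
    r, phi = cmath.polar(c)
    return complex(math.log(r), phi + 2 * math.pi * k)

for k in (-1, 0, 1, 2):
    value = log_branch(-1 + 0j, k)
    print(k, value, cmath.exp(value))  # exp of every branch gives back -1
```

For $c = -1$ the branch $k = 0$ gives $i\pi$, the value at the heart of the dispute about logarithms of negative numbers.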


FIGURE 5.4. The function $w = e^z$ and its inverse $z = \ln w$


A New View on Trigonometric Functions 


The shortest path between two truths in the real domain passes 
through the complex domain. 
(Jacques Hadamard; quoted from Kline (1972), p. 626) 
Replacing $x$ in (5.4) by $-x$ we have $e^{-ix} = \cos x - i \sin x$; by then adding and
subtracting these formulas we obtain

(5.12)   $\sin x = \frac{e^{ix} - e^{-ix}}{2i}$

(5.13)   $\cos x = \frac{e^{ix} + e^{-ix}}{2}$

(5.14)   $\tan x = \frac{1}{i} \cdot \frac{e^{ix} - e^{-ix}}{e^{ix} + e^{-ix}}.$

Thus, in the complex domain, trigonometric functions are closely related to the
exponential function. Many formulas of Sect. I.4 become connected with those
for $e^x$; e.g., de Moivre’s formulas (4.14) simply state that $e^{inx} = (e^{ix})^n$. This is
not a new proof, however, as we based it on Eq. (5.4), which was deduced from the
series of (4.16) and (4.17), which were in turn proved using de Moivre’s formulas.


Inverse Trigonometric Functions. If we insert in (5.12), (5.13), or (5.14) a variable
$u$ for $e^{ix}$ and $v$ for either $\sin x$, $\cos x$, or $\tan x$, we obtain algebraic relations
that can be solved for $u$. As a result, the inverse trigonometric functions are
expressed with the help of the complex logarithm as follows:

(5.15)   $\arcsin x = -i \ln\left(ix + \sqrt{1 - x^2}\right)$

(5.16)   $\arccos x = -i \ln\left(x + i\sqrt{1 - x^2}\right)$

(5.17)   $\arctan x = \frac{i}{2} \ln\left(\frac{i + x}{i - x}\right).$

Since the logarithmic function is many-valued, attention must be drawn to the correct
branch (i.e., value of $k$ in (5.11)) of the function to be used. The last formula
explains the striking similarity between the series of Eq. (4.24) for $y = \arctan x$
and Gregory’s series (3.15) for $\ln((1 + x)/(1 - x))$. Also, Machin’s formula of
Eq. (4.34) becomes equivalent to the factorization of the complex numbers

(5.18)   $\frac{(5 + i)^4}{239 + i} = 2(1 + i).$
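The link between Machin's formula and a factorization of complex numbers can be verified in exact integer arithmetic. The sketch below checks the identity $(5 + i)^4 = 2(1 + i)(239 + i)$, which is one way of writing that factorization (the form printed in (5.18) may differ); complex numbers are represented as integer pairs, and the helper names are illustrative.

```python
import math

def gauss_mul(z, w):
    """Product of two complex numbers given as integer pairs (re, im)."""
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])

def gauss_pow(z, n):
    """n-th power by repeated multiplication, n >= 0."""
    result = (1, 0)
    for _ in range(n):
        result = gauss_mul(result, z)
    return result

lhs = gauss_pow((5, 1), 4)          # (5 + i)^4
rhs = gauss_mul((2, 2), (239, 1))   # 2(1 + i)(239 + i)
print(lhs, rhs)
# Taking arguments on both sides gives 4 arctan(1/5) = pi/4 + arctan(1/239).
print(4 * math.atan(1 / 5) - math.atan(1 / 239), math.pi / 4)
```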


Euler’s Product for the Sine Function 


... and I already see a way for finding the sum of this row $\frac{1}{1} + \frac{1}{4} + \frac{1}{9} + \frac{1}{16}$ etc.
(Joh. Bernoulli, May 22, 1691, letter to his brother)
One of the great mathematical challenges of the early 18th century was to find an
expression for the sum of reciprocal squares

(5.19)   $1 + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \frac{1}{25} + \ldots.$

Joh. Bernoulli eagerly sought this expression for many decades. Euler (1740)
then found the following elegant solution: we know from algebra that, e.g.,

(5.20)   $1 - Ax + Bx^2 - Cx^3 = (1 - \alpha x)(1 - \beta x)(1 - \gamma x),$

where $1/\alpha$, $1/\beta$, $1/\gamma$ are the roots of the polynomial $1 - Ax + Bx^2 - Cx^3$.
Furthermore, the first of the so-called “Viète’s identities” is

(5.21)   $A = \alpha + \beta + \gamma.$

Now, we apply the same principle fearlessly to the infinite series

(5.22)   $\frac{\sin x}{x} = 1 - \frac{x^2}{6} + \frac{x^4}{120} - \ldots$

with its infinite number of roots $\pm\pi, \pm 2\pi, \pm 3\pi, \ldots$, and Eq. (5.20) becomes

$\frac{\sin x}{x} = \left(1 - \frac{x}{\pi}\right)\left(1 + \frac{x}{\pi}\right)\left(1 - \frac{x}{2\pi}\right)\left(1 + \frac{x}{2\pi}\right)\left(1 - \frac{x}{3\pi}\right)\left(1 + \frac{x}{3\pi}\right)\cdots$

$\qquad = \left(1 - \frac{x^2}{\pi^2}\right)\left(1 - \frac{x^2}{4\pi^2}\right)\left(1 - \frac{x^2}{9\pi^2}\right)\cdots.$

Comparing this relation with (5.22), the analog to (5.21) (with $x$ replaced by $x^2$)
becomes

(5.23)   $\frac{1}{\pi^2} + \frac{1}{4\pi^2} + \frac{1}{9\pi^2} + \frac{1}{16\pi^2} + \frac{1}{25\pi^2} + \ldots = \frac{1}{6}$

and the sum (5.19) is $\pi^2/6$. However audacious this argument and however beautiful
its result, its mathematical rigor was poor even by 18th century standards.
Therefore, Euler later looked for a better proof (1748, Introductio, §156). We start
with the factorization of $z^n - 1$.


Roots of Unity. The polynomial $z^n - 1$ possesses the roots $z = \sqrt[n]{1} = e^{2ik\pi/n}$,
$k = 0, \pm 1, \pm 2, \ldots$. Since $e^{2i\pi} = 1$, only $n$ consecutive values of $k$ give rise
to distinct roots. For example, for $n = 7$ these solutions are

$1, \quad e^{2i\pi/7}, \quad e^{4i\pi/7}, \quad e^{6i\pi/7}, \quad e^{-2i\pi/7}, \quad e^{-4i\pi/7}, \quad e^{-6i\pi/7}.$

A factorization similar to (5.20) is also valid for polynomials with complex roots.
Indeed, if we divide the polynomial $p(z)$ by $(z - c)$ we obtain

$p(z) = (z - c)\,q(z) + d$

with $d = p(c)$. If $c$ is a root of $p(z)$ we have obtained the factorization $p(z) =
(z - c)\,q(z)$. Applying the same procedure to $q(z)$, and repeatedly to the resulting
polynomials, a factorization of $p(z)$ into linear factors $(z - c)$ is obtained. For our
polynomial $z^7 - 1$ we thus get

$z^7 - 1 = (z - 1) \cdot (z - e^{2i\pi/7}) \cdot (z - e^{-2i\pi/7}) \cdot (z - e^{4i\pi/7}) \cdot (z - e^{-4i\pi/7}) \cdot (z - e^{6i\pi/7}) \cdot (z - e^{-6i\pi/7}),$

or, in general,

(5.1) Theorem (Euler 1748, Introductio, Chap. IX). For $n$ odd we have

(5.24)   $z^n - 1 = (z - 1) \prod_{k=1}^{(n-1)/2} (z - e^{2ik\pi/n})(z - e^{-2ik\pi/n}) = (z - 1) \prod_{k=1}^{(n-1)/2} \left(z^2 - 2z\cos\frac{2k\pi}{n} + 1\right).$

Proof. The first identity is the factorization derived above. The second one is obtained
with the help of Eq. (5.13).
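The real factorization of $z^n - 1$ for odd $n$ into $(z - 1)$ times quadratic factors $z^2 - 2z\cos(2k\pi/n) + 1$ can be checked numerically at an arbitrary complex point. A minimal sketch with an illustrative function name `factored_form`:

```python
import math

def factored_form(z: complex, n: int) -> complex:
    """(z - 1) times the product of z^2 - 2 z cos(2 k pi / n) + 1, k = 1..(n-1)/2; n odd."""
    if n % 2 == 0:
        raise ValueError("n must be odd")
    prod = z - 1
    for k in range(1, (n - 1) // 2 + 1):
        prod *= z * z - 2 * z * math.cos(2 * k * math.pi / n) + 1
    return prod

z = 0.3 + 1.1j
print(factored_form(z, 7), z ** 7 - 1)
```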


By replacing $z \mapsto z/a$ in (5.24) and multiplying by $a^n$ we obtain a slightly
more general result:

(5.25)   $z^n - a^n = (z - a) \prod_{k=1}^{(n-1)/2} \left(z^2 - 2az\cos\frac{2k\pi}{n} + a^2\right).$

We now insert $z = (1 + x/N)$, $a = (1 - x/N)$ into (5.25) and put $n = N$. This
gives

$\left(1 + \frac{x}{N}\right)^N - \left(1 - \frac{x}{N}\right)^N = \frac{2x}{N} \prod_{k=1}^{(N-1)/2} \left(2\left(1 + \frac{x^2}{N^2}\right) - 2\left(1 - \frac{x^2}{N^2}\right)\cos\frac{2k\pi}{N}\right)$

$\qquad = \frac{2x}{N} \prod_{k=1}^{(N-1)/2} \left(2\left(1 - \cos\frac{2k\pi}{N}\right) + \frac{2x^2}{N^2}\left(1 + \cos\frac{2k\pi}{N}\right)\right)$

$\qquad = C_N \cdot x \cdot \prod_{k=1}^{(N-1)/2} \left(1 + \frac{x^2}{N^2} \cdot \frac{1 + \cos(2k\pi/N)}{1 - \cos(2k\pi/N)}\right).$

Since the coefficient of $x$ in the polynomial $(1 + x/N)^N - (1 - x/N)^N$ equals
2 (see Theorem 2.1), we have $C_N = 2$ for all $N$. For large $N$ the left-hand side
of the above formula becomes $e^x - e^{-x}$ (Theorem 2.3) and, using the fact that
$\cos y \approx 1 - y^2/2$ for small $y$, the $k$th factor in the right-hand side tends to

$\left(1 + \frac{x^2}{k^2\pi^2}\right).$

Therefore, we obtain

(5.26)   $\frac{e^x - e^{-x}}{2} = x\left(1 + \frac{x^2}{\pi^2}\right)\left(1 + \frac{x^2}{4\pi^2}\right)\left(1 + \frac{x^2}{9\pi^2}\right)\cdots.$

Since there are infinitely many factors, care has to be taken with this limit (for a
justification see Exercise III.2.5).

Replacing $x$ by $ix$, we find the desired function $\sin x$ to the left. Thus we
have obtained the following famous formula in a more credible way.


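(5.2) Theorem (Euler 1748, §158). The function $\sin x$ can be factorized as

$\sin x = x\left(1 - \frac{x^2}{\pi^2}\right)\left(1 - \frac{x^2}{4\pi^2}\right)\left(1 - \frac{x^2}{9\pi^2}\right)\left(1 - \frac{x^2}{16\pi^2}\right)\cdots.$

The convergence of this product is illustrated in Fig. 5.5. We observe that the
convergence is better for smaller values of $|x|$.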
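The slow convergence of the sine product, and its dependence on $|x|$, can be observed numerically. A minimal sketch; `sin_product(x, n)` is an illustrative name for the product truncated after $n$ factors.

```python
import math

def sin_product(x: float, n: int) -> float:
    """x times the first n factors (1 - x^2 / (k^2 pi^2)) of Euler's product."""
    p = x
    for k in range(1, n + 1):
        p *= 1 - x * x / (k * k * math.pi ** 2)
    return p

for x in (0.5, 3.0):
    for n in (10, 100, 1000):
        print(x, n, abs(sin_product(x, n) - math.sin(x)))
```

The error decreases only like $1/n$, since the omitted factors differ from 1 by about $x^2/(k^2\pi^2)$.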


FIGURE 5.5. Convergence of the product of Theorem 5.2


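Wallis’ Product. We put $x = \pi/2$ in the formula of Theorem 5.2. This gives

$1 = \frac{\pi}{2}\left(1 - \frac{1}{4}\right)\left(1 - \frac{1}{16}\right)\left(1 - \frac{1}{36}\right)\cdots$

$\quad = \frac{\pi}{2}\left(1 - \frac{1}{2}\right)\left(1 + \frac{1}{2}\right)\left(1 - \frac{1}{4}\right)\left(1 + \frac{1}{4}\right)\left(1 - \frac{1}{6}\right)\left(1 + \frac{1}{6}\right)\cdots$

$\quad = \frac{\pi}{2} \cdot \frac{1}{2} \cdot \frac{3}{2} \cdot \frac{3}{4} \cdot \frac{5}{4} \cdot \frac{5}{6} \cdot \frac{7}{6} \cdots$

and we obtain the famous product of Wallis (1655),

(5.27)   $\frac{\pi}{2} = \frac{2 \cdot 2}{1 \cdot 3} \cdot \frac{4 \cdot 4}{3 \cdot 5} \cdot \frac{6 \cdot 6}{5 \cdot 7} \cdot \frac{8 \cdot 8}{7 \cdot 9} \cdots.$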
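Wallis' product, written as $\pi/2 = \prod_{k \ge 1} \frac{2k \cdot 2k}{(2k-1)(2k+1)}$, converges slowly, as a short computation shows. The function name `wallis` is illustrative.

```python
import math

def wallis(n: int) -> float:
    """Product of (2k * 2k) / ((2k - 1)(2k + 1)) for k = 1..n."""
    p = 1.0
    for k in range(1, n + 1):
        p *= (2 * k) * (2 * k) / ((2 * k - 1) * (2 * k + 1))
    return p

for n in (10, 1000, 100000):
    print(n, wallis(n), math.pi / 2 - wallis(n))
```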


Remark. The original proof by Wallis starts from the fact that $\pi/2$ is the area below
$(1 - x^2)^{1/2}$ (between $-1$ and $+1$), followed by a complicated procedure of
interpolation based on the known areas for $(1 - x^2)^0$, $(1 - x^2)^1$, $(1 - x^2)^2, \ldots$.
Precisely this idea inspired Newton in his discovery of the general binomial theorem
as discussed in Sect. I.2.


Exercises 


5.1 (Euler 1748, §185.) Set $x = \pi/6$ in the formula of Theorem 5.2 and obtain,
with the help of $\sin(\pi/6) = 1/2$, another product for $\pi/2$:

(5.28)   $\frac{\pi}{2} = \frac{3}{2} \cdot \frac{6 \cdot 6}{5 \cdot 7} \cdot \frac{12 \cdot 12}{11 \cdot 13} \cdot \frac{18 \cdot 18}{17 \cdot 19} \cdot \frac{24 \cdot 24}{23 \cdot 25} \cdots;$

then insert $x = \pi/4$, multiply the obtained product by Wallis’ product, and
obtain the following interesting formula:

(5.29)   $\sqrt{2} = \frac{2 \cdot 2}{1 \cdot 3} \cdot \frac{6 \cdot 6}{5 \cdot 7} \cdot \frac{10 \cdot 10}{9 \cdot 11} \cdot \frac{14 \cdot 14}{13 \cdot 15} \cdots.$

5.2 (Euler, Introductio §166, 168). Generalize (5.19) and (5.21) in the following
way: let

$1 + A_1 z + A_2 z^2 + A_3 z^3 + \ldots = (1 + \alpha_1 z)(1 + \alpha_2 z)(1 + \alpha_3 z) \cdots$

(here $z$ stands for $x^2$ in Theorem 5.2), and define the sums of the powers

$S_1 = \alpha_1 + \alpha_2 + \alpha_3 + \ldots$
$S_2 = \alpha_1^2 + \alpha_2^2 + \alpha_3^2 + \ldots$
$S_3 = \alpha_1^3 + \alpha_2^3 + \alpha_3^3 + \ldots,$

and so on. Then, present a “demonstratio gemina theorematis Neutoniani”

(5.30)   $S_1 = A_1$
         $S_2 = A_1 S_1 - 2A_2$
         $S_3 = A_1 S_2 - A_2 S_1 + 3A_3$
         $S_4 = A_1 S_3 - A_2 S_2 + A_3 S_1 - 4A_4$

and deduce from these formulas and from Theorem 5.2 the following sums:

(5.31)   $1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \ldots = \frac{\pi^2}{6} = \frac{2^2\pi^2}{2 \cdot 2!} \cdot \frac{1}{6}$
         $1 + \frac{1}{2^4} + \frac{1}{3^4} + \frac{1}{4^4} + \ldots = \frac{\pi^4}{90} = \frac{2^4\pi^4}{2 \cdot 4!} \cdot \frac{1}{30}$
         $1 + \frac{1}{2^6} + \frac{1}{3^6} + \frac{1}{4^6} + \ldots = \frac{\pi^6}{945} = \frac{2^6\pi^6}{2 \cdot 6!} \cdot \frac{1}{42}$
         $1 + \frac{1}{2^8} + \frac{1}{3^8} + \frac{1}{4^8} + \ldots = \frac{\pi^8}{9450} = \frac{2^8\pi^8}{2 \cdot 8!} \cdot \frac{1}{30}.$

Remark. Actually, Euler wrote these expressions a little differently, and
the connection with the “Bernoulli numbers” (see Sect. II.10 below) became
clear to him only a couple of years later (1755, Institutiones Calculi
Differentialis, Caput V, §124, 125, 151, “ingrediuntur in expressiones
summarum...”).

5.3 (Euler 1748, §169). Show, either by a proof similar to the preceding one
(starting from the roots of $z^n + 1 = 0$), or by using $\cos x = \sin 2x/(2\sin x)$,
that

$\cos x = \prod_{k=1}^{\infty} \left(1 - \frac{4x^2}{(2k-1)^2\pi^2}\right) = \left(1 - \frac{4x^2}{\pi^2}\right)\left(1 - \frac{4x^2}{9\pi^2}\right)\left(1 - \frac{4x^2}{25\pi^2}\right)\cdots.$

Obtain by using this product such expressions as

(5.32)   $1 + \frac{1}{3^2} + \frac{1}{5^2} + \frac{1}{7^2} + \frac{1}{9^2} + \ldots = \frac{\pi^2}{8}$
         $1 + \frac{1}{3^4} + \frac{1}{5^4} + \frac{1}{7^4} + \frac{1}{9^4} + \ldots = \frac{\pi^4}{96}.$

Show that (5.32) can also be obtained directly from (5.31).

5.4 (Euler 1748, §189-198). Take the logarithm of the formula of Theorem 5.2
(which transforms the product into a sum) and derive ingenious ways of computing
$\ln(\sin(x))$ by using the expansions (5.31).

5.5 Using Cardano’s formula (1.14) compute all roots of
(5.33)   $x^3 - 5x + 2 = 0.$
In spite of the fact that all three roots are real, one has to compute the cube
roots of a complex number.

5.6 Simplify the computation of the roots of (5.33) by the following idea (Viète
1591a): set $x = \mu\cos\alpha$ and replace $\cos\alpha$ by $x/\mu$ in the identity $\cos 3\alpha =
4\cos^3\alpha - 3\cos\alpha$ in order to get

$x^3 - \frac{3\mu^2}{4}\,x - \frac{\mu^3}{4}\cos 3\alpha = 0.$

Compare this equation with (5.33) to obtain $\mu$, $\alpha$, and $x$.
The theory of continued fractions is one of the most useful theories in Arithmetic
... since it is absent from most works on Arithmetic and Algebra, it
may not be well known among geometers. I would be satisfied if I were
able to contribute to make it slightly more familiar.
(Lagrange 1793, Oeuvres, vol. 7, p. 6-7)


We say therefore; that the Circle is to the Square of the Diameter, as 1 to
$1 \times \frac{9}{8} \times \frac{25}{24} \times \frac{49}{48} \times \frac{81}{80} \times$ &c, infinitely. Or as 1 to

$1 + \cfrac{1}{2 + \cfrac{9}{2 + \cfrac{25}{2 + \cfrac{49}{2 + \cfrac{81}{2 + \ldots}}}}}$   &c, infinitely.

How these Approximations were obtained ... would be too long here to
insert; but may by those be seen, who please to consult that Treatise.
(J. Wallis 1685, A Treatise of Algebra, p. 318)

After having seen the use of infinite sums and infinite products in analysis, we
now discuss a third possibility of an “infinitorum” process, infinite quotients, i.e.,
continued fractions.


Origins 


The Euclidean Algorithm. This algorithm for the computation of the greatest
common divisor of two integers has been known for more than 2000 years (Euclid,
~ 300 B.C., Elements, Book VII, Propositions 1 and 2). Let two positive integers
be given, for example 105 and 24. We divide the larger by the smaller and obtain
the quotient 4 with remainder 9, i.e.,

$105/24 = 4 + 9/24.$

We now continue the process with the divisor and the remainder:

$24/9 = 2 + 6/9, \qquad 9/6 = 1 + 3/6, \qquad 6/3 = 2.$

The algorithm must stop, since the remainders form a strictly decreasing sequence
of positive integers. The last nonzero remainder (here 3) is the greatest common
divisor we were looking for, and by combining successive steps we get

(6.1)   $\frac{105}{24} = 4 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{2}}}.$

Irrational Numbers. If this form of the Euclidean algorithm (repeatedly subtract
the integer part and invert) is applied to an irrational number, it cannot terminate,
since a finite expression as in (6.1) must be rational. For example, with $\alpha = \sqrt{2}$
we obtain

$1.4142\ldots = 1 + 0.4142\ldots = 1 + \cfrac{1}{2.4142\ldots} = 1 + \cfrac{1}{2 + 0.4142\ldots}.$

The reappearance of the digits of $\sqrt{2}$ in the last quotient is no surprise, since $\sqrt{2}$
satisfies precisely $\alpha = 1 + 1/(1 + \alpha)$ (multiply by $1 + \alpha$ to see this). Continuing,
we obtain the following formula of Bombelli from 1572:

(6.2)   $\sqrt{2} = 1 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2 + \ldots}}}.$

The simplest of all sequences is obtained from the “golden mean”, which gives

(6.3)   $\frac{1 + \sqrt{5}}{2} = 1.61803\ldots = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \ldots}}}.$

Further examples are as follows:

(6.4)   $\sqrt{3} = 1 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{2 + \ldots}}}}, \qquad e = 2 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{4 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{6 + \ldots}}}}}}}},$

(6.5)   $\frac{e - 1}{e + 1} = \cfrac{1}{2 + \cfrac{1}{6 + \cfrac{1}{10 + \cfrac{1}{14 + \ldots}}}}, \qquad \pi = 3 + \cfrac{1}{7 + \cfrac{1}{15 + \cfrac{1}{1 + \cfrac{1}{292 + \cfrac{1}{1 + \ldots}}}}}.$

The quotients $1, 1, 2, 1, 2, 1, 2, 1, \ldots$ which appear for $\sqrt{3}$ are periodic, those
for $e$ and for $(e - 1)/(e + 1)$ also exhibit a regular behaviour. We shall explain this
below for $(e - 1)/(e + 1)$, which is $\tanh(1/2)$ (cf. Eq. (6.31) below). However,
the regularity for $e$ is trickier (see Hurwitz, Werke 2, p. 130). No regularity at all
appears for the quotients of $\pi$, even if we compute thousands of them (Lambert
(1770a) computed 27, Lochs (1963) computed 968).


Lord Brouncker’s Fraction for $\pi/4$. One year after the discovery of Wallis’
product for $\pi$, Lord Brouncker succeeded in transforming it into an interesting
continued fraction (see the quotation above and Eq. (6.23) below). This result
inspired Wallis to include a theory on continued fractions on the last two pages of
his Arithmetica Infinitorum (1655, see Opera, vol. 1, p. 474-475).




Lambert’s Continued Fraction for tan x. 
But the incentive for seeking these formulas came from Eulers Analysis 
infinitorum, where the expression ... appears in the form of an example. 
(Lambert 1770a) 
As we have seen in Sect. I.4, the function $\tan x = \sin x/\cos x$ does not have a
particularly simple expansion into an infinite series. We start from

$\tan x = \frac{\sin x}{\cos x} = \frac{x - x^3/6 + x^5/120 - \ldots}{1 - x^2/2 + x^4/24 - \ldots} = \cfrac{x}{\cfrac{1 - x^2/2 + x^4/24 - \ldots}{1 - x^2/6 + x^4/120 - \ldots}}.$

For $x \to 0$, the denominator tends to 1. We therefore subtract 1 and obtain

$\tan x = \cfrac{x}{1 - \cfrac{x^2/3 - x^4/30 + \ldots}{1 - x^2/6 + x^4/120 - \ldots}} = \cfrac{x}{1 - \cfrac{x^2}{\cfrac{1 - x^2/6 + x^4/120 - \ldots}{1/3 - x^2/30 + \ldots}}}.$

Here, for $x \to 0$, the last denominator tends to 3. Subtracting 3 we then obtain

$\tan x = \cfrac{x}{1 - \cfrac{x^2}{3 - \cfrac{x^2/15 - \ldots}{1/3 - x^2/30 + \ldots}}}.$

Continuing like this, we find that the subsequent denominators are 5, then 7, and
so on. For an 18th century man (Lambert 1768) there is then no doubt that the
following formula is true in general:

(6.6)   $\tan x = \cfrac{x}{1 - \cfrac{x^2}{3 - \cfrac{x^2}{5 - \cfrac{x^2}{7 - \cfrac{x^2}{9 - \ldots}}}}}.$

A couple of decades later, Legendre (1794) gave a complete proof (see Exercise
6.6).


An expression of the type

(6.7)   $q_0 + \cfrac{p_1}{q_1 + \cfrac{p_2}{q_2 + \cfrac{p_3}{q_3 + \ldots}}}$

is called a continued fraction. The fractions $p_1/q_1, p_2/q_2, p_3/q_3, \ldots$ are called the
partial quotients of the continued fraction. If all $p_k = 1$, the continued fraction is
called regular.
Convergents


If the continued fraction (6.7) is truncated at its $k$th quotient, we obtain a rational
number

(6.8)   $q_0 + \cfrac{p_1}{q_1 + \cfrac{p_2}{q_2 + \ldots + \cfrac{p_k}{q_k}}},$

which is called the $k$th convergent of the continued fraction. We want to write
these rational numbers as quotients of two integers. The first cases are easy:

(6.9a)   $q_0 + \frac{p_1}{q_1} = \frac{q_0 q_1 + p_1}{q_1}$

(6.9b)   $q_0 + \cfrac{p_1}{q_1 + \cfrac{p_2}{q_2}} = q_0 + \frac{p_1 q_2}{q_1 q_2 + p_2} = \frac{q_0 q_1 q_2 + q_0 p_2 + p_1 q_2}{q_1 q_2 + p_2}.$

Let $A_k$ denote the numerator, and $B_k$ the denominator, when the expression (6.8)
is evaluated in this manner. From (6.9) we have

$A_0 = q_0, \qquad B_0 = 1,$
$A_1 = q_0 q_1 + p_1, \qquad B_1 = q_1,$
$A_2 = q_0 q_1 q_2 + q_0 p_2 + p_1 q_2, \qquad B_2 = q_1 q_2 + p_2.$

We now look at these formulas, as Euler says, “with a bit of attention” (tamen
attendenti statim patebit), and discover the following beautiful structure:

(6.10)   $A_2 = q_2 A_1 + p_2 A_0, \qquad B_2 = q_2 B_1 + p_2 B_0.$

For the computation of $A_3$ and $B_3$, whose quotient must be

$q_0 + \cfrac{p_1}{q_1 + \cfrac{p_2}{q_2 + p_3/q_3}},$

we could, by comparing with (6.9b), take the formulas for $A_2$ and $B_2$ and replace
everywhere $q_2$ by the quantity $q_2 + p_3/q_3$. But the expressions obtained in this
manner would in general not be integers. We therefore multiply both numbers by
$q_3$, which does not alter their quotient, and have from (6.10),

$A_3 = \big((q_2 + p_3/q_3) A_1 + p_2 A_0\big) \cdot q_3, \qquad B_3 = \big((q_2 + p_3/q_3) B_1 + p_2 B_0\big) \cdot q_3.$

These two expressions become, after simplification,

$A_3 = q_3 A_2 + p_3 A_1, \qquad B_3 = q_3 B_2 + p_3 B_1.$

This structure now repeats again and again and we have

(6.1) Theorem (Wallis 1655, Euler 1737b). The numerators and denominators of
the convergents (6.8) are determined recursively by

(6.11)   $A_k = q_k A_{k-1} + p_k A_{k-2}$
         $B_k = q_k B_{k-1} + p_k B_{k-2}$

with

(6.12)   $A_{-1} = 1, \qquad A_0 = q_0, \qquad A_1 = q_1 q_0 + p_1$
         $B_{-1} = 0, \qquad B_0 = 1, \qquad B_1 = q_1.$
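The recursion for the numerators and denominators of the convergents is a two-term linear recurrence and can be sketched in a few lines. The function name `convergents` and its argument layout (a start value $q_0$ and a list of pairs $(p_k, q_k)$) are illustrative choices; the pairs $(A_k, B_k)$ are returned as computed, without cancelling common factors.

```python
def convergents(q0: int, pq: list) -> list:
    """Pairs (A_k, B_k) for k = 0, 1, ..., from A_k = q_k A_{k-1} + p_k A_{k-2}
    and B_k = q_k B_{k-1} + p_k B_{k-2}, starting with A_{-1} = 1, B_{-1} = 0."""
    a_prev, a = 1, q0
    b_prev, b = 0, 1
    result = [(a, b)]
    for p, q in pq:
        a_prev, a = a, q * a + p * a_prev
        b_prev, b = b, q * b + p * b_prev
        result.append((a, b))
    return result

print(convergents(1, [(1, 2)] * 4))                           # sqrt(2)
print(convergents(3, [(1, 7), (1, 15), (1, 1), (1, 292)]))    # pi
```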


(6.2) Examples. Equations (6.11) and (6.12) applied to the above examples lead
to sequences of rational numbers,

$\frac{1+\sqrt{5}}{2} \approx$ 1/1, 2/1, 3/2, 5/3, 8/5, 13/8, 21/13, 34/21, 55/34, 89/55, 144/89, ...
$\sqrt{2} \approx$ 1/1, 3/2, 7/5, 17/12, 41/29, 99/70, 239/169, 577/408, 1393/985, 3363/2378, ...
$\sqrt{3} \approx$ 1/1, 2/1, 5/3, 7/4, 19/11, 26/15, 71/41, 97/56, 265/153, 362/209, 989/571, 1351/780, ...
$e \approx$ 2/1, 3/1, 8/3, 11/4, 19/7, 87/32, 106/39, 193/71, 1264/465, 1457/536, 2721/1001, ...
$\pi \approx$ 3/1, 22/7, 333/106, 355/113, 103993/33102, 104348/33215, ...

which (see Fig. 6.1) rapidly approach the original irrational numbers.


FIGURE 6.1. Errors for convergents $A_k/B_k$ (logarithmic scale)
The approximations for $\sqrt{2}$ and $\sqrt{3}$ were known in antiquity (Archimedes
used $265/153 < \sqrt{3} < 1351/780$ without further comment). The two convergents
22/7 (Archimedes) and 355/113 (Tsu Chung-chih around 480 in China,
Adrianus Metius 1571-1635 in Europe) for $\pi$ are of a better than average quality.
Explanation: the first denominator $q_{k+1}$ to be neglected is large (15, respectively,
292). Two other very precise approximations for $\pi$, which are the 11th and
26th convergents respectively, were calculated in 1766 in Japan by Y. Arima
as 5419351/1725033 and 428224593349304/136308121570117 (see Hayashi
1902). On the other hand, for the golden mean (all $q_k = 1$) we have slow convergence.
Here, (6.11) becomes the recursion formula for the Fibonacci numbers
(Leonardo da Pisa 1170-1250, also called Fibonacci).

Some convergents of the continued fraction (6.6) for $\tan x$,

(6.13)   $\frac{x}{1}, \quad \frac{3x}{3 - x^2}, \quad \frac{15x - x^3}{15 - 6x^2}, \quad \frac{105x - 10x^3}{105 - 45x^2 + x^4}, \quad \frac{945x - 105x^3 + x^5}{945 - 420x^2 + 15x^4},$

are displayed in Fig. 6.2 and nicely approach the function $\tan x$, even beyond the
singularities $x = \pi/2, 3\pi/2, \ldots$.
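The convergents of Lambert's continued fraction for $\tan x$ can be evaluated numerically with the same two-term recursion, taking $q_0 = 0$, $p_1 = x$, $p_k = -x^2$ for $k \ge 2$ and $q_k = 2k - 1$. This is a sketch with an illustrative function name, not a robust implementation (no safeguard against a vanishing denominator).

```python
import math

def tan_convergent(x: float, k: int) -> float:
    """k-th convergent A_k / B_k of x / (1 - x^2 / (3 - x^2 / (5 - ...)))."""
    a_prev, a = 1.0, 0.0   # A_{-1}, A_0 (q_0 = 0)
    b_prev, b = 0.0, 1.0   # B_{-1}, B_0
    for j in range(1, k + 1):
        p = x if j == 1 else -x * x
        q = 2 * j - 1
        a_prev, a = a, q * a + p * a_prev
        b_prev, b = b, q * b + p * b_prev
    return a / b

print(tan_convergent(1.0, 5), math.tan(1.0))    # 841/540 against tan 1
print(tan_convergent(2.0, 10), math.tan(2.0))   # beyond the singularity at pi/2
```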


FIGURE 6.2. Convergents of the continued fraction for tan x 


Infinite Series from Continued Fractions. The difference of two successive
convergents satisfies

(6.14)   $\frac{A_{k+1}}{B_{k+1}} - \frac{A_k}{B_k} = \frac{A_{k+1} B_k - A_k B_{k+1}}{B_k B_{k+1}} = (-1)^k \, \frac{p_1 p_2 \cdots p_{k+1}}{B_k B_{k+1}}.$

The last identity is seen as follows: using (6.11) we have

$A_{k+1} B_k - A_k B_{k+1} = (q_{k+1} A_k + p_{k+1} A_{k-1}) B_k - A_k (q_{k+1} B_k + p_{k+1} B_{k-1})$
$\qquad = -p_{k+1} (A_k B_{k-1} - A_{k-1} B_k) = \ldots$
$\qquad = p_2 \cdots p_{k+1} (-1)^k (A_1 B_0 - A_0 B_1)$

and $(A_1 B_0 - A_0 B_1) = p_1$ because of (6.12). Writing the convergent $A_k/B_k$ as

$\frac{A_k}{B_k} = \left(\frac{A_k}{B_k} - \frac{A_{k-1}}{B_{k-1}}\right) + \left(\frac{A_{k-1}}{B_{k-1}} - \frac{A_{k-2}}{B_{k-2}}\right) + \ldots + \left(\frac{A_1}{B_1} - \frac{A_0}{B_0}\right) + \frac{A_0}{B_0},$

we see from (6.14) that

(6.15)   $\frac{A_k}{B_k} = q_0 + \frac{p_1}{B_1} - \frac{p_1 p_2}{B_1 B_2} + \frac{p_1 p_2 p_3}{B_2 B_3} - \ldots + (-1)^{k-1} \frac{p_1 p_2 \cdots p_k}{B_{k-1} B_k}$

and we have

(6.3) Theorem. The convergents of (6.7) are the truncated sums of the series

(6.16)   $q_0 + \frac{p_1}{B_1} - \frac{p_1 p_2}{B_1 B_2} + \frac{p_1 p_2 p_3}{B_2 B_3} - \frac{p_1 p_2 p_3 p_4}{B_3 B_4} + \ldots.$

For regular continued fractions (all $p_k = 1$) we have

(6.16')   $q_0 + \frac{1}{B_1} - \frac{1}{B_1 B_2} + \frac{1}{B_2 B_3} - \frac{1}{B_3 B_4} + \ldots.$

Since $1/(B_{k-1} B_k)$ is the smallest possible distance between two different rational
numbers with denominators $B_{k-1}$ and $B_k$, the interval between $A_{k-1}/B_{k-1}$ and
$A_k/B_k$ cannot contain a rational number whose denominator is not larger than
$B_k$.


Continued Fractions from Infinite Series. Let

(6.17)   $\frac{1}{c_1} - \frac{1}{c_2} + \frac{1}{c_3} - \frac{1}{c_4} + \frac{1}{c_5} - \ldots$

be a given series with integer $c_i$; we want to find integers $p_i$, $q_i$ such that the series
(6.17) coincides term by term with (6.16) (with $q_0 = 0$).

Solution. We put $p_1 = 1$ and $q_1 = B_1 = c_1$. Then, we divide two successive terms
of (6.16) (so that the products of $p_i$ simplify), which gives

(6.18)   $c_{k-1} B_k = c_k p_k B_{k-2}.$

This resembles, apart from the factors $c_{k-1}$ and $c_k$, the Eq. (6.11). We therefore
subtract from (6.18) Eq. (6.11), once multiplied by $c_{k-1}$, once by $c_k$, and obtain

$c_{k-1} q_k B_{k-1} = (c_k - c_{k-1}) p_k B_{k-2}$
$(c_{k-1} - c_k) B_k = -c_k q_k B_{k-1}.$

In the first formula we replace $k$ by $k + 1$ and then divide the two expressions.
This eliminates the $B_k$’s and gives

(6.19)   $\frac{c_k q_{k+1}}{c_k - c_{k-1}} = \frac{(c_{k+1} - c_k) p_{k+1}}{c_k q_k}.$

The $p_i$, $q_i$ are, of course, not uniquely defined. Since we want them to be integers,
a natural choice that satisfies (6.19) is

(6.20)   $p_{k+1} = c_k^2, \qquad q_{k+1} = c_{k+1} - c_k$

for $k \ge 1$. Thus, we have the following formula of Euler (1748, §369):

(6.21)   $\frac{1}{c_1} - \frac{1}{c_2} + \frac{1}{c_3} - \frac{1}{c_4} + \ldots = \cfrac{1}{c_1 + \cfrac{c_1^2}{c_2 - c_1 + \cfrac{c_2^2}{c_3 - c_2 + \cfrac{c_3^2}{c_4 - c_3 + \ldots}}}}$

(6.22)   $\ln 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \ldots = \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{4}{1 + \cfrac{9}{1 + \cfrac{16}{1 + \ldots}}}}}$

(6.23)   $\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \ldots = \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{9}{2 + \cfrac{25}{2 + \cfrac{49}{2 + \ldots}}}}}.$

The second continued fraction is the one found by Lord Brouncker, obtained here
from Leibniz’s series.
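That the convergents of Euler's continued fraction coincide with the partial sums of the alternating series can be confirmed in exact arithmetic. The sketch below takes $p_1 = 1$, $q_1 = c_1$, $p_{k+1} = c_k^2$, $q_{k+1} = c_{k+1} - c_k$ and $q_0 = 0$; both function names are illustrative.

```python
from fractions import Fraction

def euler_cf_convergents(c: list) -> list:
    """Convergents of 1/(c1 + c1^2/(c2 - c1 + c2^2/(c3 - c2 + ...)))."""
    pq = [(1, c[0])] + [(c[k - 1] ** 2, c[k] - c[k - 1]) for k in range(1, len(c))]
    a_prev, a, b_prev, b = 1, 0, 0, 1   # A_{-1}, A_0 = q_0 = 0, B_{-1}, B_0
    out = []
    for p, q in pq:
        a_prev, a = a, q * a + p * a_prev
        b_prev, b = b, q * b + p * b_prev
        out.append(Fraction(a, b))
    return out

def partial_sums(c: list) -> list:
    """Partial sums of 1/c1 - 1/c2 + 1/c3 - ..."""
    total, out = Fraction(0), []
    for j, cj in enumerate(c):
        total += Fraction((-1) ** j, cj)
        out.append(total)
    return out

print(euler_cf_convergents([1, 2, 3, 4, 5]))   # series for ln 2
print(partial_sums([1, 3, 5, 7, 9]))           # Leibniz's series for pi/4
```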
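Similarly, we prove (Euler 1748, §370)

(6.24)   $\frac{1}{c_1} - \frac{1}{c_1 c_2} + \frac{1}{c_1 c_2 c_3} - \ldots = \cfrac{1}{c_1 + \cfrac{c_1}{c_2 - 1 + \cfrac{c_2}{c_3 - 1 + \cfrac{c_3}{c_4 - 1 + \ldots}}}}$

whence, for example,

(6.25)   $1 - \frac{1}{e} = \frac{1}{1} - \frac{1}{1 \cdot 2} + \frac{1}{1 \cdot 2 \cdot 3} - \ldots = \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{2}{2 + \cfrac{3}{3 + \ldots}}}}.$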
Irrationality


I have good reason to doubt that the present article will be read, or even 
understood, by those who should profit most by it, namely those who spend 
time and efforts in trying to square the circle. There will always be enough 
such persons ... who understand very little of geometry .. . 
(Lambert 1770a) 
One of the great unsolved problems of classical analysis was the quadrature of
the circle (i.e., the construction of $\pi$) by ruler and compass. Lambert was one of
the first to believe that this construction, which challenged mathematicians for
2000 years, was impossible. A first hint toward this result would be the fact that
$\pi$ is irrational. We are therefore interested in a theorem that states that an infinite
continued fraction represents an irrational number.


First difficulty. It can happen that a continued fraction represents no number at all.
To see this, we start from the series

(6.26)   $\frac{2}{1} - \frac{3}{2} + \frac{4}{3} - \frac{5}{4} + \frac{6}{5} - \ldots.$

Since its terms approach $\pm 1$, it clearly does not converge. To obtain a corresponding
continued fraction, we put $c_k = k/(k+1)$ (see (6.17)) and obtain from (6.19),
after simplification,

$\frac{p_{k+1}}{q_{k+1} \cdot q_k} = k^3 (k + 2).$

With $p_{k+1} = k^3 (k + 2)$ and $q_k = 1$ we have integer coefficients (the first term
$1/c_1 = 2$ of the series requires $p_1 = 2$) and see that the convergents of the
continued fraction

(6.27)   $\cfrac{2}{1 + \cfrac{3}{1 + \cfrac{32}{1 + \cfrac{135}{1 + \ldots}}}}$

do not tend to a real number.
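The oscillation of these convergents can be observed in exact arithmetic. The sketch assumes $q_0 = 0$, all $q_k = 1$, $p_{k+1} = k^3(k+2)$ for $k \ge 1$, and $p_1 = 2$ (an assumption made here so that the first convergent equals the first term $1/c_1 = 2$ of the series with $c_k = k/(k+1)$); the function name is illustrative.

```python
from fractions import Fraction

def oscillating_convergents(n: int) -> list:
    """First n convergents of 2/(1 + 3/(1 + 32/(1 + 135/(1 + ...))))."""
    a_prev, a, b_prev, b = 1, 0, 0, 1           # A_{-1}, A_0, B_{-1}, B_0
    out = []
    for k in range(1, n + 1):
        p = 2 if k == 1 else (k - 1) ** 3 * (k + 1)   # p_k, with q_k = 1
        a_prev, a = a, a + p * a_prev
        b_prev, b = b, b + p * b_prev
        out.append(Fraction(a, b))
    return out

conv = oscillating_convergents(8)
print([float(v) for v in conv])   # jumps by more than 1 at every step
```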


Second difficulty. There are infinite continued fractions that represent a rational
number. For example, we have $2 = 1 + 2/2$ and obtain, by inserting 2 repeatedly,

(6.28)   $2 = 1 + \cfrac{2}{1 + \cfrac{2}{1 + \cfrac{2}{1 + \ldots}}},$

which is rational.


(6.4) Theorem. If the $p_j$ and $q_j$ are integers and if from a certain index $j \ge j_0$
onward

(6.29)   $0 < p_j \le q_j,$

then the continued fraction (6.7) tends to a number $\alpha$ that is irrational.
Proof. Without loss of generality we may assume that $0 < p_j \le q_j$ is satisfied for
all $j$. Otherwise, we consider the continued fraction starting with $p_{j_0}/q_{j_0}$.
Convergence of this continued fraction and its irrationality are equivalent to convergence
and irrationality of the original one.

The assumption that $0 < p_j \le q_j$ guarantees that the convergents of the
continued fraction tend to a real number. This is a consequence of the “Leibniz
criterion” and will be discussed in Sect. II.2.

Following an idea of Legendre (1794, Eléments de Géométrie, Note IV), we
now write the continued fraction (6.7) without $q_0$ as

(6.30)   $\alpha = \cfrac{p_1}{q_1 + \beta}$   with   $\beta = \cfrac{p_2}{q_2 + \cfrac{p_3}{q_3 + \cfrac{p_4}{q_4 + \ldots}}}.$

Since $q_1 \ge p_1$ and $\beta > 0$ we have $\alpha < 1$. Suppose now that $\alpha = B/A$ is rational
with $0 < B < A$. A simple reformulation of (6.30) yields

$\beta = \frac{p_1}{\alpha} - q_1 = \frac{A p_1 - B q_1}{B},$

so that $\beta$ is expressed as a rational number with denominator smaller than that
of $\alpha$. If we repeat the same reasoning with $\beta = p_2/(q_2 + \gamma)$ and so on, we find
smaller and smaller denominators that are all integers. This is not possible an
infinite number of times.


Negative $p_j$. The conclusion of Theorem 6.4 is also valid, if (6.29) is replaced by
(6.29’)   $2|p_j| \le q_j - 1.$
This is seen by repeated application of the identity (valid for $p_j < 0$)

$q_{j-1} + \cfrac{p_j}{q_j + \beta} = q_{j-1} - 1 + \cfrac{1}{1 + \cfrac{|p_j|}{q_j - |p_j| + \beta}},$

which, under the assumption (6.29’), transforms the continued fraction into another
one satisfying (6.29).

(6.5) Theorem (Lambert 1768, 1770a, Legendre 1794). For each rational $x$ ($x \ne 0$)
the value $\tan x$ is irrational.


Proof. Suppose that $x = m/n$ is rational and insert this into (6.6):

(6.31)   $\tan\frac{m}{n} = \cfrac{m/n}{1 - \cfrac{m^2/n^2}{3 - \cfrac{m^2/n^2}{5 - \cfrac{m^2/n^2}{7 - \ldots}}}} = \cfrac{m}{n - \cfrac{m^2}{3n - \cfrac{m^2}{5n - \cfrac{m^2}{7n - \ldots}}}}.$

On the right we have a continued fraction with integer coefficients. Since the factors
$1, 3, 5, 7, 9, \ldots$ approach infinity, condition (6.29’) is, for all $m$ and $n$, satisfied
beyond a certain index $i_0$.


The same result is true for the arctan function; indeed, for $y$ rational, $x =
\arctan y$ must be irrational, otherwise $y = \tan x$ would be irrational by Theorem 6.5.
In particular, $\pi = 4 \arctan 1$ must be irrational.

The proof of the analogous result for the hyperbolic tangent $\tanh x =
(e^x - e^{-x})/(e^x + e^{-x}) = (e^{2x} - 1)/(e^{2x} + 1)$ is even easier, since all minus
signs in (6.31) become plus signs. Inverting the last formula, we have $e^x =
(1 + \tanh(x/2))/(1 - \tanh(x/2))$, and still obtain the irrationality of $e^x$ and $\ln x$
for rational $x \ne 0$ and $x \ne 1$, respectively.


Exercises 


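6.1 Show that with the use of matrix notation, the numerators and denominators
$A_k$ and $B_k$ of the convergents (6.8) can be expressed in the following form:

$\begin{pmatrix} A_k & A_{k-1} \\ B_k & B_{k-1} \end{pmatrix} = \begin{pmatrix} q_0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} q_1 & 1 \\ p_1 & 0 \end{pmatrix} \begin{pmatrix} q_2 & 1 \\ p_2 & 0 \end{pmatrix} \cdots \begin{pmatrix} q_{k-1} & 1 \\ p_{k-1} & 0 \end{pmatrix} \begin{pmatrix} q_k & 1 \\ p_k & 0 \end{pmatrix}.$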
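6.2 Compute numerically the regular continued fractions for the numbers

$$\sqrt{2},\ \sqrt{3},\ \sqrt{5},\ \sqrt{6},\ \sqrt{7},\ \sqrt[3]{2},\ \sqrt[3]{3},\ \sqrt[3]{4},\ \sqrt[3]{5},\ \sqrt[3]{6},\ \sqrt[3]{7}$$

and discover a significant difference between the square and the cube roots.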
6.3 Show that 


are solutions of a second-degree equation. Compute their values. 
6.4 The length of an astronomical year is (Euler 1748, §382) 


365 days 5 hours 48’55”. 


Compute the development of 5 hours 48’55” (measured in days) into a reg- 
ular continued fraction and compute the corresponding convergents. Don’t 
forget to give your valuable advice to Pope Gregory XIII for the reform of 
his calendar. 


6.5 Give a detailed proof of Eq. (6.24). 
6.6 Prove formula (6.6).
Hint (Legendre 1794). Define

$$\varphi(z) = 1 + \frac{a}{z} + \frac{a^2}{2!\,z(z+1)} + \frac{a^3}{3!\,z(z+1)(z+2)} + \cdots$$

and show that $\varphi(z) - \varphi(z+1) = \frac{a}{z(z+1)}\,\varphi(z+2)$. Next, define

(6.33)    $$\psi(z) = \frac{a\,\varphi(z+1)}{z\,\varphi(z)} \quad\text{such that}\quad \psi(z) = \frac{a}{z + \psi(z+1)}.$$

Iterating (6.33) leads to a continued fraction. Finally, put $a = x^2/4$ so that $\varphi(1/2) = \cosh x$ and $x\,\varphi(3/2) = \sinh x$, and replace $x$ by $ix$. We note that these formulas are related to continued fractions for hypergeometric functions (Gauss, Heine, see Perron 1913, p. 313, 353).
II Differential and Integral Calculus


[Illustration with the lettering "Institutiones calculi differentialis et integralis"]

The extent of this calculus is immense: it applies to curves both mechanical 
and geometrical; radical signs cause it no difficulty, and even are often con- 
venient; it extends to as many variables as one wishes; the comparison of 
infinitely small quantities of all sorts is easy. And it gives rise to an infinity 
of surprising discoveries concerning curved or straight tangents, questions 
De maximis & minimis, inflexion points and cusps of curves, envelopes, 
caustics from reflexion or refraction, &c. as we shall see in this work. 

(Marquis de L’ Hospital 1696, Introduction to Analyse des infiniment petits) 


This chapter introduces the differential and integral calculus, the greatest inven- 
tions of all time in mathematics. We explain the ideas of Leibniz, the Bernoullis, 
and Euler. A rigorous treatment in the spirit of the 19th century will be the subject 
of Sections III.5 and III.6. 

As we see in the above illustration, this calculus sheds light on the obscure 
machinery of scientific research. 
II.1 The Derivative


And I dare say that this is not only the most useful and most general prob- 
lem in geometry that I know, but even that I ever desired to know. 
(Descartes 1637, p. 342, Engl. transl. p. 95) 


Isaac Newton was not a pleasant man. His relations with other academics 
were notorious, with most of his later life spent embroiled in heated dis- 
putes ... A serious dispute arose with the German philosopher Gottfried 
Leibniz. Both Leibniz and Newton had independently developed a branch 
of mathematics called calculus, which underlies most of modern physics 
... Following the death of Leibniz, Newton is reported to have declared 
that he had taken great satisfaction in ‘breaking Leibniz’s heart’. 
(Hawking 1988, A brief history of time, Bantam Editors, New York) 


What contempt for the non-English! We have found these methods, without 
any help from the English. 
(Joh. Bernoulli 1735, Opera, vol. IV, p. 170) 


What you report about Bernard Niewentijt is just small beer. Who could 
refrain from laughing at his ridiculous hair-splitting about our calculus, as 
if he were blind to its advantages. 

(Letter of Joh. Bernoulli, quoted from Parmentier 1989, p. 316). 


We shall call the function fx a primitive function of the functions f'x, f’’x, 

&c. which derive from it, and we shall call these latter the derived functions 

of the first one. (Lagrange 1797) 
Problem. Let $y = f(x)$ be a given curve. At each point $x$ we wish to know the slope of the curve, the tangent or the normal to the curve.


Motivations. 

— Calculation of the angles under which two curves intersect (Descartes); 

— construction of telescopes (Galilei), of clocks (Huygens 1673); 

— search for the maxima, minima of a function (Fermat 1638); 

— velocity and acceleration of a movement (Galilei 1638, Newton 1686); and 
— astronomy, verification of the Law of Gravitation (Kepler, Newton). 


The Derivative 


The Linear Function $y = ax + b$. In addition to the fixed value $x$, we consider the perturbed value $x + \Delta x$. The corresponding $y$-values are $y = ax + b$ and $y + \Delta y = a(x + \Delta x) + b$, hence $\Delta y = a\,\Delta x$. The slope of the line, defined by $\Delta y/\Delta x$, is equal to $a$. Fig. 1.1 shows functions $y = ax + 1$ for different values of $a$.


FIGURE 1.1. Slopes in dependence of $a$ (panel labels include $a = 2$, $1$, $1/2$, $-1/2$, $-1$)
The Parabola $y = x^2$. If $x$ increases by $\Delta x$, then $y$ increases to $y + \Delta y = (x + \Delta x)^2 = x^2 + 2x\,\Delta x + (\Delta x)^2$ so that (see Fig. 1.2a)


(1.1)    $\Delta y = 2x\,\Delta x + (\Delta x)^2.$


Therefore, the slope of the line connecting $(x, y)$ with $(x + \Delta x, y + \Delta y)$ is equal to $2x + \Delta x$. If $\Delta x$ tends to zero, this slope will approach that of the tangent to the parabola.
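The statement about the secant slope can be checked numerically. The short Python sketch below (the helper name `secant_slope` and the sample point are chosen here only for illustration) prints the slope of the connecting line for shrinking increments, next to the value 2x + Δx predicted by the computation above.

```python
def secant_slope(f, x, dx):
    """Slope of the line through (x, f(x)) and (x + dx, f(x + dx))."""
    return (f(x + dx) - f(x)) / dx

def square(x):
    return x * x

x = 1.5
for dx in (1.0, 0.1, 0.001):
    # The second and third columns agree up to rounding; both approach 2x = 3.
    print(dx, secant_slope(square, x, dx), 2 * x + dx)
```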


FIGURE 1.2a. Tangent to parabola
FIGURE 1.2b. Tangent to parabola (Drawing of Joh. Bernoulli 1691/92)¹


Leibniz (1684) imagines that $\Delta x$ and $\Delta y$ become “infinitely small” (“tangentem invenire, esse rectam ducere, quae duo curvae puncta distantiam infinite parvam habentia, jungat, ...”) and denotes them by $dx$ and $dy$. Then we neglect the term $(dx)^2$, which is “infinitely smaller” than $2x\,dx$, and obtain, instead of (1.1),


(1.1')    $dy = 2x\,dx$    or    $\frac{dy}{dx} = 2x.$

Newton (1671, pub. 1736, p. 20) considers his variables $v, x, y, z$ “as gradually and indefinitely increasing, ... And the velocities by which every Fluent is increased by its general motion, (which I may call Fluxions, ...) I shall represent by the same Letters pointed thus $\dot v, \dot x, \dot y, \dot z$”. Their values are obtained by “rejecting the Terms ... as being equal to nothing”. Newton categorically refused the publication (“Pray let none of my mathematical papers be printed w[ith]out my special licence”).


Jac. and Joh. Bernoulli re-invent the differential calculus a third time, based on Leibniz’s obscure publication from 1684 (“une énigme plutôt qu’une explication”). Joh. Bernoulli (1691/92) then gave private lessons on the new calculus to the very noble Marquis de L’Hospital. For him, infinitely small quantities are just quantities that can be added to finite quantities without altering their values and curves are polygons with infinitely short sides. Furthermore, this greatest of all teachers (besides his numerous sons and nephews and de L’Hospital, he also introduced Euler to mathematics) held the opinion that too many explanations on the infinitely small would rather trouble the understanding of those who are not “accoutumés à de longues explications”.


¹ Reproduced with permission of Univ. Bibl. Basel.


B. Nieventijt gives in 1694 a first criticism of the infinitely small (see the letter of 
Joh. Bernoulli quoted above), followed by a “Responsio” of Leibniz (in the July 
1695 issue of the journal Acta Eruditorum). 


Marquis de L’ Hospital (1696) writes the famous book Analyse des infiniment pe- 
tits (see Fig. 1.3), which leads to the definitive breakthrough of the new calculus, 
even in France, where science was governed for many decades by the “Cartesians” 
(abbé Catelan, Papin, Rolle, ...). 


FIGURE 1.3. Drawing from de L’Hospital (1696), Analyse des infiniment petits²


Bishop Berkeley published the polemic article The Analyst in 1734 against the 
infinitely small (see the quotation in Sect. II.2 and Struik 1969, p. 333). 


Maclaurin (1742, Treatise of Fluxions, vol. I, p.420): “... investigate the ratio 
which is the limit .. .” 


Euler (1755, Institutiones Calculi Differentialis) starts with two long chapters De differentiis finitis and De usu differentiarum in doctrina serierum, followed by six pages in Latin on the infinite, before daring to write “denotet dx quantitatem infinite parvam” ($dx = 0$ and $a\,dx = 0$), but requires that “ratio geometrica $\frac{a\,dx}{dx} = \frac{a}{1}$ erit finita”. He favors Leibniz’s notation against Newton’s by saying that “... incommode hoc modo $\overset{\cdot\cdot\cdot\cdot\cdot\cdot\cdot\cdot\cdot\cdot}{y}$ repraesentatur, cum nostro signandi modo $d^{10}y$ facillime comprehendatur”.


D’Alembert (1754, Encyclopédie) introduces a clear notion of the limit (“This 
limit is the value which the ratio z/n approaches more and more ... Nothing is 
clearer than this idea; ...”). 


Lagrange (1797) rejects the infinitely small straightaway and tries to base analysis on power series (“One knows the difficulties created by the assumption of infinitely small quantities, upon which Leibniz constructs his Calculus.”) He introduces the name derivative and uses for $dy/dx$ the notation (see quotation)


(1.2)    $y'$    or    $f'(x)$.


² Reproduced with permission of Bibl. Publ. Univ. Genève.


Cauchy (1823) condemns the Taylor series (counterexample $y = e^{-1/x^2}$, see Sect. III.7 below) and reestablishes the infinitely small as a limit.


Bolzano (1817) and Weierstrass (1861) bring the notion of limit to perfection with $\varepsilon$ and $\delta$ (see Chap. III).


F. Klein (1908) defends the educational value of the infinitely small (“The force of 
conviction inherent in such naive guiding reflections is, of course, different for dif- 
ferent individuals. Many — and I include myself here — find them very satisfying. 
Others, again, who are gifted only on the purely logical side, find them thoroughly 
meaningless ... In this connection, I should like to commend the Leibniz notation ...”).


Differentiation Rules 


His positis calculi regulae erunt tales: 
(Leibniz 1684) 


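Sums and Constant Factors. Let $y(x) = a \cdot u(x) + b \cdot v(x)$, where $a$ and $b$ are constant factors. Setting $y + \Delta y = y(x + \Delta x)$, $u + \Delta u = u(x + \Delta x)$, $v + \Delta v = v(x + \Delta x)$, we have


$$\Delta y = a \cdot \Delta u + b \cdot \Delta v$$


and we get the differentiation rule


(1.3)    $y = au + bv \;\Rightarrow\; \frac{dy}{dx} = a \cdot \frac{du}{dx} + b \cdot \frac{dv}{dx}$    or    $y' = au' + bv'.$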


Products. For the product of two functions $y(x) = u(x) \cdot v(x)$ we have


$$y + \Delta y = u(x + \Delta x) \cdot v(x + \Delta x) = (u + \Delta u) \cdot (v + \Delta v) = uv + u\,\Delta v + v\,\Delta u + \Delta u\,\Delta v,$$


which leads to $dy = u\,dv + v\,du$ “because $du\,dv$ is an infinitely small quantity when compared to the other terms $u\,dv$ & $v\,du$” (de L’Hospital 1696, p. 4) or


(1.4)    $y = u \cdot v \;\Rightarrow\; \frac{dy}{dx} = u\,\frac{dv}{dx} + v\,\frac{du}{dx}$    or    $y' = u'v + uv'.$


Examples. We write $x^3$ as a product $y = x^3 = x^2 \cdot x$ and the above formula yields $y' = x^2 \cdot 1 + x \cdot 2x = 3x^2$. Similarly, for the product $y = x^4 = x^3 \cdot x$ we get $y' = x^3 \cdot 1 + x \cdot 3x^2 = 4x^3$. By induction, we see in this way that for any positive integer $n$


(1.5)    $y = x^n \;\Rightarrow\; y' = n\,x^{n-1}.$

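Quotients. For the quotient $y(x) = u(x)/v(x)$ of two functions we have


$$y + \Delta y = \frac{u + \Delta u}{v + \Delta v}.$$


Subtracting $y$ on each side and using the geometric series for $(1 + \Delta v/v)^{-1}$ yields for $v \ne 0$


$$\Delta y = \frac{u + \Delta u}{v + \Delta v} - \frac{u}{v} = \frac{v\,\Delta u - u\,\Delta v}{v(v + \Delta v)} = \frac{v\,\Delta u - u\,\Delta v}{v^2}\Bigl(1 - \frac{\Delta v}{v} + \frac{(\Delta v)^2}{v^2} - \cdots\Bigr).$$


Therefore, we have for $v \ne 0$


(1.6)    $y = \frac{u}{v} \;\Rightarrow\; \frac{dy}{dx} = \frac{\frac{du}{dx}\,v - u\,\frac{dv}{dx}}{v^2}$    or    $y' = \frac{u'v - uv'}{v^2}.$


Example. The function $y = x^{-n} = 1/x^n$ is the quotient of $u = 1$ and $v = x^n$. By applying (1.6) we get


$$\frac{dy}{dx} = \frac{-n\,x^{n-1}}{x^{2n}} = -n \cdot \frac{1}{x^{n+1}} = -n\,x^{-n-1}.$$


This is Eq. (1.5) for negative $n$.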
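The rules (1.3), (1.4), and (1.6) are mechanical enough to be carried out by a program. The following Python sketch propagates a pair (value, derivative) through sums, products, and quotients; this is the idea usually called forward-mode automatic differentiation, which is an illustration added here and not a method of the text. The names `Dual` and `power` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dual:
    """A value together with its derivative with respect to x."""
    val: float
    der: float

    def __add__(self, other):
        # Rule (1.3) with a = b = 1.
        return Dual(self.val + other.val, self.der + other.der)

    def __mul__(self, other):
        # Product rule (1.4): (uv)' = u'v + uv'.
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    def __truediv__(self, other):
        # Quotient rule (1.6): (u/v)' = (u'v - uv') / v^2, for v != 0.
        return Dual(self.val / other.val,
                    (self.der * other.val - self.val * other.der) / other.val ** 2)

def power(x: Dual, n: int) -> Dual:
    """x^n for a non-negative integer n by repeated use of the product rule."""
    result = Dual(1.0, 0.0)
    for _ in range(n):
        result = result * x
    return result

x = Dual(2.0, 1.0)        # the variable x at the point 2, with dx/dx = 1
one = Dual(1.0, 0.0)      # a constant has derivative 0
y = one / power(x, 3)     # y = x^(-3)
print(y.der, -3 * 2.0 ** (-4))
```

Both printed numbers agree, in accordance with Eq. (1.5) for n = -3.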


FIGURE 1.4. An inverse function (figure annotation: slope = 1/2)


Inverse Functions. Let $y = f(x)$ be a given function and $x = g(y)$ its inverse. Since the graphs are reflected in the 45° axis (Fig. 1.4), we have


(1.7)    $\frac{\Delta x}{\Delta y} = \frac{1}{\Delta y/\Delta x} \;\Rightarrow\; \frac{dx}{dy} = \frac{1}{dy/dx}$    for    $\frac{dy}{dx} \ne 0.$

Example. $y = x^{1/2}$ is the inverse function of $x = y^2$. Therefore,


$$\frac{dy}{dx} = \frac{d}{dx}\,x^{1/2} = \frac{1}{dx/dy} = \frac{1}{2y} = \frac{1}{2\sqrt{x}} = \frac{1}{2}\,x^{-1/2},$$


and Eq. (1.5) appears to be true for rational $n$.


Exponential Function. For the exponential function $y = e^x$ (Sect. I.2) we have


$$y + \Delta y = e^{x + \Delta x} = e^x \cdot e^{\Delta x} \quad\text{and}\quad \Delta y = e^x\,(e^{\Delta x} - 1).$$


Using the series $e^{\Delta x} = 1 + \Delta x + (\Delta x)^2/2! + \ldots$ (Theorem I.2.3) we therefore obtain


(1.8)    $y = e^x \;\Rightarrow\; y' = e^x.$

The exponential function is its own derivative.


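Logarithms. There are several ways to compute the derivative of $y = \ln x$.
a) It is the inverse function of $x = e^y$. By (1.7),


(1.9)    $y = \ln x \;\Rightarrow\; \frac{dy}{dx} = \frac{1}{dx/dy} = \frac{1}{e^y} = \frac{1}{x}.$


b) We can also compute $\Delta y$ from $y + \Delta y = \ln(x + \Delta x)$ and obtain


$$\Delta y = \ln(x + \Delta x) - \ln(x) = \ln\frac{x + \Delta x}{x} = \ln\Bigl(1 + \frac{\Delta x}{x}\Bigr).$$


With the series for $\ln\bigl(1 + \frac{\Delta x}{x}\bigr) = \frac{\Delta x}{x} - \frac{(\Delta x)^2}{2x^2} + \ldots$ we obtain (1.9).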
Trigonometric Functions. Consider first $y = \sin x$. Using Eq. (I.4.3) we get

$$y + \Delta y = \sin(x + \Delta x) = \sin x \cos\Delta x + \cos x \sin\Delta x.$$

With the series expansions for $\sin\Delta x$ and $\cos\Delta x$ (see (I.4.16) and (I.4.17)) we obtain

$$\Delta y = \sin x\,\Bigl(-\frac{(\Delta x)^2}{2!} + \frac{(\Delta x)^4}{4!} - \ldots\Bigr) + \cos x\,\Bigl(\Delta x - \frac{(\Delta x)^3}{3!} + \ldots\Bigr)$$

and consequently

(1.10)    $y = \sin x \;\Rightarrow\; y' = \cos x.$

Similarly,

(1.11)    $y = \cos x \;\Rightarrow\; y' = -\sin x.$


For $y = \tan x = \sin x/\cos x$ we use (1.6) and obtain

(1.12)    $y = \tan x \;\Rightarrow\; \frac{dy}{dx} = \frac{\cos^2 x + \sin^2 x}{\cos^2 x} = 1 + \tan^2 x = \frac{1}{\cos^2 x}.$


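Inverse Trigonometric Functions. As a consequence of (1.7) and the above formulas for the derivatives of the trigonometric functions, we have


(1.13)    $y = \arctan x \;\Rightarrow\; \frac{dy}{dx} = \frac{1}{dx/dy} = \frac{1}{1 + \tan^2 y} = \frac{1}{1 + x^2},$

(1.14)    $y = \arcsin x \;\Rightarrow\; \frac{dy}{dx} = \frac{1}{\cos y} = \frac{1}{\sqrt{1 - \sin^2 y}} = \frac{1}{\sqrt{1 - x^2}},$

(1.15)    $y = \arccos x \;\Rightarrow\; \frac{dy}{dx} = \frac{1}{-\sin y} = \frac{-1}{\sqrt{1 - \cos^2 y}} = \frac{-1}{\sqrt{1 - x^2}}.$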


Composite Functions. Consider a function $y = h(x) = f(g(x))$ and let $z = g(x)$. For the incremented values we have $z + \Delta z = g(x + \Delta x)$ and $y + \Delta y = h(x + \Delta x) = f(z + \Delta z)$. From the trivial identity


$$\frac{\Delta y}{\Delta x} = \frac{\Delta y}{\Delta z} \cdot \frac{\Delta z}{\Delta x}$$

it follows that

(1.16)    $\frac{dy}{dx} = \frac{dy}{dz} \cdot \frac{dz}{dx}$    or    $h'(x) = f'(z) \cdot g'(x).$


In order to differentiate a composite function, one has to multiply the derivatives of the functions $f$ and $g$.


FIGURE 1.5. A composite function


Example. The function $y = \sin(2x)$ is composed as $y = \sin z$ and $z = 2x$ (see Fig. 1.5). By (1.16) its derivative is $y' = \cos z \cdot 2 = 2\cos(2x)$.

Relying on these rules, the computation of the derivative of any function 
composed of elementary functions (Descartes’ great dream, see quotation at the 
beginning of this section) has become a banality. For instance, 
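$$y = a^x = e^{x \ln a} \quad (z = x \cdot \ln a): \qquad \frac{dy}{dx} = \frac{dy}{dz} \cdot \frac{dz}{dx} = e^z \cdot \ln a = \ln a \cdot a^x,$$
$$y = x^a = e^{a \ln x} \quad (z = a \cdot \ln x): \qquad \frac{dy}{dx} = \frac{dy}{dz} \cdot \frac{dz}{dx} = e^z \cdot \frac{a}{x} = a \cdot x^{a-1}.$$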


Thus, we have Eq. (1.5) for any real number n. 
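The derivatives of the general exponential and the general power can be cross-checked against symmetric difference quotients. The Python sketch below uses an arbitrarily chosen base, exponent, and evaluation point; the helper `central_difference` and the step size are assumptions of this illustration, not part of the text.

```python
import math

def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)

a, x = 2.5, 1.3

# y = a^x: the chain rule gives dy/dx = ln(a) * a^x.
d_exp = central_difference(lambda s: a ** s, x)
print(d_exp, math.log(a) * a ** x)

# y = x^a: the chain rule gives dy/dx = a * x^(a - 1).
d_pow = central_difference(lambda s: s ** a, x)
print(d_pow, a * x ** (a - 1))
```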


Parametric Representation and Implicit Equations 


We take as an example a curve of venerable age: the conchoid of Nicomedes (200 
B.C.). For two given constants a and b the conchoid is defined as follows: on any 
ray through the origin G the distance of a point A on the conchoid and the point F 
on a horizontal line of height a is of constant length b (see Fig. 1.6). 


FIGURE 1.6. The conchoid of Nicomedes 


The similarity of triangles FAB and FGL gives the relation

$$\frac{y - a}{a} = \frac{b}{\sqrt{x^2 + y^2} - b},$$

which leads to

(1.17)    $(y - a)^2 (x^2 + y^2) = b^2 y^2.$


If we wanted to express y as a function of x from this equation, we would have to 
solve a polynomial equation of degree 4 for each x. We should try instead to work 
with the implicit equation (1.17) itself. 

Another possibility is to denote the angle LGF by $\varphi$ and obtain


(1.18)    $x = a \tan\varphi + b \sin\varphi, \qquad y = a + b \cos\varphi.$


When $\varphi$ varies from $-\pi/2$ to $\pi/2$, the expressions (1.18) then form a parametric representation of our curve. Such parametric representations are not unique. For example, we may also use the distance GF as parameter $t$ (see Fig. 1.6). Then we obtain


(1.19)    $x = (b/t + 1)\sqrt{t^2 - a^2}, \qquad y = (b/t + 1)\,a.$


This represents the right half of the curve when $t$ varies from $a$ to $\infty$.
We now consider the problem of computing the tangent to the conchoid at a 
given point A (this is “Aufgabe 7” of Joh. Bernoulli 1691/92). 


Differentiation of the Parametric Equation. We consider $y$ in the second equation of (1.18) or (1.19) as a function of the parameter, and we interpret the parameter as the inverse function of $x$ of the first equation. Then we have by (1.16) and (1.7),

(1.20)    $\frac{dy}{dx} = \frac{dy}{d\varphi} \cdot \frac{d\varphi}{dx} = \frac{dy}{d\varphi} \Big/ \frac{dx}{d\varphi}$    or    $\frac{dy}{dx} = \frac{dy}{dt} \Big/ \frac{dx}{dt}.$

Thank you, Leibniz, once again, for your notation. Differentiating the equations (1.19) and dividing the derivatives we obtain for the conchoid

(1.21)    $\frac{dy}{dx} = \frac{-ab\sqrt{t^2 - a^2}}{t^3 + a^2 b}.$

This formula allows a nice interpretation (Joh. Bernoulli 1691/92): denote by M the point such that triangles LGF and GMA are similar. Then, the tangent in A is parallel to the line connecting M and F (see Fig. 1.6).
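The slope of the conchoid obtained from the parametric representation with the parameter t = GF can be compared with a difference quotient taken along the curve. In the Python sketch below the constants a = 1, b = 2, the parameter value t = 2, and all function names are chosen for illustration only.

```python
import math

def conchoid_point(t, a, b):
    """Point (x, y) of the right half of the conchoid, parameter t = GF > a."""
    scale = b / t + 1.0
    return scale * math.sqrt(t * t - a * a), scale * a

def slope_formula(t, a, b):
    """dy/dx = -a b sqrt(t^2 - a^2) / (t^3 + a^2 b)."""
    return -a * b * math.sqrt(t * t - a * a) / (t ** 3 + a * a * b)

def slope_numeric(t, a, b, h=1e-6):
    """Quotient of small increments of y and x along the curve."""
    x1, y1 = conchoid_point(t - h, a, b)
    x2, y2 = conchoid_point(t + h, a, b)
    return (y2 - y1) / (x2 - x1)

a, b, t = 1.0, 2.0, 2.0
print(slope_formula(t, a, b), slope_numeric(t, a, b))
```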


Implicit Differentiation. This method, already used by Leibniz (1684), consists of using the above rules to differentiate directly an implicit equation defining the function $y(x)$ (in our example the equation (1.17)). This gives


$$2(y - a)\,dy\,(x^2 + y^2) + (y - a)^2 (2x\,dx + 2y\,dy) = 2b^2 y\,dy$$

and after division by $2\,dx$,


(1.22)    $\frac{dy}{dx} = \frac{-x\,(y - a)^2}{(y - a)(x^2 + y^2) + (y - a)^2\,y - b^2 y}.$


This implicit differentiation will be discussed more rigorously in Sect. IV.3. 
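As a consistency check, the slope from implicit differentiation and the slope from the parametric representation should coincide at every point of the curve. The Python sketch below tests this at one sample point; the constants a = 1, b = 2, t = 2 and the function names are assumptions made for this illustration.

```python
import math

def conchoid_point(t, a, b):
    """Point (x, y) of the conchoid for the parameter t = GF > a."""
    scale = b / t + 1.0
    return scale * math.sqrt(t * t - a * a), scale * a

def implicit_slope(x, y, a, b):
    """dy/dx from differentiating (y - a)^2 (x^2 + y^2) = b^2 y^2 implicitly."""
    numerator = -x * (y - a) ** 2
    denominator = (y - a) * (x * x + y * y) + (y - a) ** 2 * y - b * b * y
    return numerator / denominator

def parametric_slope(t, a, b):
    """dy/dx from dividing dy/dt by dx/dt."""
    return -a * b * math.sqrt(t * t - a * a) / (t ** 3 + a * a * b)

a, b, t = 1.0, 2.0, 2.0
x, y = conchoid_point(t, a, b)
# The point satisfies the implicit equation up to rounding.
residual = (y - a) ** 2 * (x * x + y * y) - b * b * y * y
print(residual, implicit_slope(x, y, a, b), parametric_slope(t, a, b))
```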


Exercises 


1.1 Extend the differentiation rule (1.4) to three factors


$$y = u \cdot v \cdot w \;\Rightarrow\; y' = u' \cdot v \cdot w + u \cdot v' \cdot w + u \cdot v \cdot w'.$$


1.2 Compute the derivative dy/dx of 


5 sin(3x + bya? + €?*) - tan (Se) +S 
WS a ee 


3a2 23 Bea : 3x 
—— + + 5) : 3 
arccos = = T/a) e€ arcsin J=_2 


arctan( x 
1.3 (An example of Euler 1755, §192). Show that if

$$y = e^{e^{e^x}} \quad\text{then}\quad y' = e^{e^{e^x}} \cdot e^{e^x} \cdot e^x.$$


1.4 Compute the derivative of the cissoid of Diocles (about 180 B.C.). This curve, used by Diocles for solving the Delian problem of duplicating the cube, is created by the circle MCE as the set of points of intersection of the lines DM and BF, where the arcs BC and CD are equal. Show that the tangent at A is parallel to the line EH, where H is such that EF and GH are parallel.


1.5 Compute the derivative of the circle defined by $x^2 + y^2 = r^2$ by implicit differentiation as well as by solving for $y$ followed by explicit differentiation.


1.6 (Leibniz 1684). Compute the derivative of the function $y(x)$ defined by

$$\frac{x}{y} + \frac{(a + bx)(c - xx)}{(ex + fxx)^2} + ax\sqrt{gg + yy} + \frac{yy}{\sqrt{hh + lx + mxx}} = 0,$$

where $a, b, c, e, f, g, h, l$, and $m$ are constants. This equation does not represent any ancient famous Babylonian or Egyptian curve and has no other particular interest either. It was just chosen by Leibniz as a horribly complicated expression in order to demonstrate the power of his calculus.
But the velocities of the velocities, the second, third, fourth, and fifth velocities,
&c., exceed, if I mistake not, all human understanding. The further
the mind analyseth and pursueth these fugitive ideas the more it is lost and 
bewildered; ... 

(Bishop Berkeley 1734, The Analyst, see Struik 1969, Source Book, p. 335) 


... our modern analysts are not content to consider only the differences
of finite quantities: they also consider the differences of those differences, 
and the differences of the differences of the first differences. And so on 
ad infinitum. That is, they consider quantities infinitely less than the least 
discernible quantity; and others infinitely less than those infinitely small 
ones; and still others infinitely less than the preceding infinitesimals, and 
so without end or limit ... Now to conceive a quantity infinitely small ... 
is, I confess, above my capacity. But to conceive a part of such infinitely 
small quantity that shall be still infinitely less than it, and consequently 
though multiplied infinitely shall never equal the minutest finite quantity, 
is, I suspect, an infinite difficulty to any man whatsoever; .. . 

(Bishop Berkeley 1734, The Analyst) 


The Second Derivative 


We have seen in Sect. II.1 that for a given function $y = f(x)$ the derivative $f'(x)$ is the slope of the tangent to the curve $y = f(x)$. Therefore, if $f'(x) > 0$ for $a < x < b$, the function is increasing on that interval; if $f'(x) < 0$ for $a < x < b$, it is decreasing. Points at which $f'(x) = 0$ are called stationary points.


FIGURE 2.1a. Geometrical meaning of the second derivative
FIGURE 2.1b. A drawing of Joh. Bernoulli (1691/92)¹


Newton (1665) and Joh. Bernoulli (1691/92) were the first to study the geometric meaning of the second derivative of $f$. We differentiate $y' = f'(x)$ to obtain $y'' = f''(x)$. If $f''(x) > 0$ for $a < x < b$, then $f'(x)$ will be increasing, i.e., for two points $x_0 < x_1$ we will have $f'(x_0) < f'(x_1)$. This means that the curve is steeper at $x_1$ than at $x_0$ and therefore is crooked upward (see Fig. 2.1a, left). We then say that the function $f(x)$ is convex downward.

Similarly, if $f''(x) < 0$ for $a < x < b$, the function $f(x)$ is convex upward (see Fig. 2.1a, right). Points with $f''(x_0) = 0$, where the second derivative changes sign, are called inflection points. Fig. 2.1b reproduces a drawing of Joh. Bernoulli explaining these facts.


¹ Reproduced with permission of Univ. Bibl. Basel.

Problems “de maximis & minimis”.


I just wish him to know that our questions de maximis et minimis and de 
tangentibus linearum curvarum were perfect eight or ten years ago and that 
several persons who have seen them in the last five or six years can bear 
witness to this. 

(Letter from Fermat to Descartes, June 1638; Oeuvres, tome 2, p. 154-162) 


When a Quantity is the greatest or the least that it can be, at that moment 
it neither flows backward or forward. For if it flows forward, or increases, 
that proves it was less, and will presently be greater than it is. ... Wherefore
find its Fluxion, by Prob. 1 and suppose it to be nothing. 
(Newton 1671, engl. pub. 1736, p. 44) 
The problem of finding maximal or minimal values was one of the very first moti- 
vations for the differential calculus (Fermat 1638) and was cultivated by Lagrange 
throughout his life (see Lagrange 1759). 

At a maximal or minimal value of a function $f(x)$, this function can neither increase nor decrease. Hence we must have $f'(x_0) = 0$ (stationary point). It will be a (local) maximum if the sign of $f'(x)$ changes from $+$ to $-$ (this is the case if $f''(x_0) < 0$) and a (local) minimum if it changes from $-$ to $+$ (this happens if $f''(x_0) > 0$). We summarize this as


(2.1)    $f'(x_0) = 0$ and $f''(x_0) > 0 \;\Rightarrow\; x_0$ is a local minimum,
         $f'(x_0) = 0$ and $f''(x_0) < 0 \;\Rightarrow\; x_0$ is a local maximum.


These facts “sequentibus exemplis illustrabimus”’: 


Example 1. We choose


(2.2)    $y = x^3 - x^2 - 3x, \qquad y' = 3x^2 - 2x - 3, \qquad y'' = 6x - 2.$


The function can be seen to increase where $y' > 0$, i.e., for $x < (1 - \sqrt{10})/3$ and for $x > (1 + \sqrt{10})/3$. It is convex downward for $x > 1/3$ and convex upward for $x < 1/3$. The point $x = 1/3$ is an inflection point. The point $x = (1 - \sqrt{10})/3$ is a local (but not global) maximum, the point $x = (1 + \sqrt{10})/3$ is a local minimum.
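The classification of the two stationary points of Example 1 can be reproduced in a few lines. In the Python sketch below the function names and the helper `classify` are illustrative choices; the rule applied is the second-derivative test stated in (2.1).

```python
import math

def f(x):
    return x ** 3 - x ** 2 - 3 * x

def f_prime(x):
    return 3 * x ** 2 - 2 * x - 3

def f_second(x):
    return 6 * x - 2

def classify(x0):
    """Second-derivative test at a stationary point x0."""
    if f_second(x0) > 0:
        return "local minimum"
    if f_second(x0) < 0:
        return "local maximum"
    return "undecided"

# Roots of y' = 3x^2 - 2x - 3 = 0.
stationary = [(1 - math.sqrt(10)) / 3, (1 + math.sqrt(10)) / 3]
for x0 in stationary:
    print(round(x0, 4), round(f(x0), 4), classify(x0))
```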


Example 2. We consider the function (see Euler 1755, Pars Posterior, §265)


(2.3)    $y = \frac{x}{1 + x^2}, \qquad y' = \frac{1 - x^2}{(1 + x^2)^2}, \qquad y'' = \frac{-6x + 2x^3}{(1 + x^2)^3},$

which, together with its first and second derivative, is plotted in Fig. 2.2. The function $y(x)$ possesses a (global) minimum for $x = -1$, a (global) maximum for $x = 1$, and inflection points at $x = 0$ and $x = \pm\sqrt{3}$. It is convex downward on the intervals $-\sqrt{3} < x < 0$ and $\sqrt{3} < x < \infty$ and convex upward elsewhere.
FIGURE 2.2. Maxima, minima, inflection points of Euler’s example


Fermat’s Principle. 


FIGURE 2.3. Drawing by Joh. Bernoulli 1691/92²
FIGURE 2.4. Fermat’s principle


Fermat wishes to explain the law of Snellius for the refraction of light between two media in which the velocities are $v_1$ and $v_2$, respectively. Let two points $A$, $B$ (see Fig. 2.4) be given. Find angles $\alpha_1$ and $\alpha_2$ such that light travels from $A$ to $B$ in minimal time or with minimal resistance. This means, find $x$ such that


(2.4)    $T = \frac{\sqrt{a^2 + x^2}}{v_1} + \frac{\sqrt{b^2 + (\ell - x)^2}}{v_2} = \min!$


Fermat himself found the problem too difficult for an analytical treatment (“I admit that this problem is not one of the easiest”). The computations were then proudly performed by Leibniz (1684) “in tribus lineis”. The derivative of $T$ as a function of $x$ is


$$T' = \frac{1}{v_1} \cdot \frac{2x}{2\sqrt{a^2 + x^2}} + \frac{-2(\ell - x)}{2\sqrt{b^2 + (\ell - x)^2}} \cdot \frac{1}{v_2}.$$


Observing that $\sin\alpha_1 = x/\sqrt{a^2 + x^2}$ and $\sin\alpha_2 = (\ell - x)/\sqrt{b^2 + (\ell - x)^2}$, we see that this derivative vanishes whenever


(2.5)    $\frac{\sin\alpha_1}{v_1} = \frac{\sin\alpha_2}{v_2}$

(law of Snellius). The computation of $T''$,


$$T'' = \frac{1}{v_1} \cdot \frac{a^2}{(a^2 + x^2)^{3/2}} + \frac{1}{v_2} \cdot \frac{b^2}{(b^2 + (\ell - x)^2)^{3/2}} > 0,$$


shows that our result is really a minimum.


² Reproduced with permission of Univ. Bibl. Basel.
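The law of Snellius can also be observed numerically by minimizing the travel time directly, without using the derivative, and then comparing the two ratios sin α / v at the minimizer. The Python sketch below does this with a simple ternary search, which is justified here because T'' > 0. The numerical values of a, b, ℓ, v1, v2 and the function names are assumptions made for the illustration.

```python
import math

def travel_time(x, a, b, ell, v1, v2):
    """Time T(x) for the broken path from A to B crossing the interface at x."""
    return math.hypot(a, x) / v1 + math.hypot(b, ell - x) / v2

def minimize_ternary(func, lo, hi, steps=200):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(steps):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

a, b, ell, v1, v2 = 1.0, 1.0, 2.0, 1.0, 0.5
x_min = minimize_ternary(lambda x: travel_time(x, a, b, ell, v1, v2), 0.0, ell)

sin_alpha1 = x_min / math.hypot(a, x_min)
sin_alpha2 = (ell - x_min) / math.hypot(b, ell - x_min)
# At the minimizer the two ratios agree up to the accuracy of the search.
print(x_min, sin_alpha1 / v1, sin_alpha2 / v2)
```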


De Conversione Functionum in Series 
Taylor’s Approach. 


We have here, in fact, a passage to the limit of unexampled audacity. 
(F. Klein 1908, Engl. ed., p. 233) 
We consider (Taylor 1715) for a function $f(x)$ the points $x_0$, $x_1 = x_0 + \Delta x$, $x_2 = x_0 + 2\Delta x$, ... and the function values $y_0 = f(x_0)$, $y_1 = f(x_1)$, $y_2 = f(x_2)$, ....


FIGURE 2.5. Creation of the Taylor polynomial


Then we compute the interpolation polynomial passing through these points (see Fig. 2.5 and Theorem I.1.2; for the latter we define $x = x_0 + t\,\Delta x$, $t = (x - x_0)/\Delta x$)


(2.6)    $p(x) = y_0 + \frac{x - x_0}{1} \cdot \frac{\Delta y_0}{\Delta x} + \frac{(x - x_0)(x - x_1)}{1 \cdot 2} \cdot \frac{\Delta^2 y_0}{\Delta x^2},$


or with more such terms for higher degrees. If we let $\Delta x \to 0$, $x_1 \to x_0$, $x_2 \to x_0$ (or, as we said: if we take $\Delta x$ infinitely small), the quotient $\Delta y_0/\Delta x$ in the second term tends to $f'(x_0)$. Further, the product $(x - x_0)(x - x_1)$, which appears in the third term, will tend to $(x - x_0)^2$. It was then postulated by Taylor that the second differences (divided by $\Delta x^2$) will tend to the second derivative (see Exercises 2.5 and II.6.4); in general,


(2.7)    $\frac{\Delta^k y_0}{\Delta x^k} \to f^{(k)}(x_0)$    for $\Delta x \to 0$.

If we consider in the interpolation polynomial (2.6) more and more terms, and, at the same time, take the limit as $\Delta x \to 0$, we obtain the famous formula

(2.8)    $f(x) = f(x_0) + (x - x_0)\,f'(x_0) + \frac{(x - x_0)^2}{2!}\,f''(x_0) + \frac{(x - x_0)^3}{3!}\,f'''(x_0) + \ldots$

All the series of the first chapter are special cases of this “series universalissima”. For example, the function $f(x) = \ln(1 + x)$ has the derivatives


$$f(0) = 0, \qquad f'(0) = 1, \qquad f^{(k)}(0) = (-1)^{k-1}(k - 1)!$$


and we obtain


$$\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} \pm \ldots$$
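The coefficients obtained from the derivatives of ln(1 + x) at 0 can be tested by summing the resulting series. The Python sketch below (the function name `log1p_taylor` and the sample point x = 0.5 are illustrative choices) prints partial sums next to the library logarithm.

```python
import math

def log1p_taylor(x, terms):
    """Partial sum x - x^2/2 + x^3/3 - ... with the given number of terms."""
    return sum((-1) ** (k - 1) * x ** k / k for k in range(1, terms + 1))

x = 0.5
for terms in (2, 5, 10, 40):
    print(terms, log1p_taylor(x, terms), math.log(1 + x))
```

For |x| close to 1 the convergence becomes slow, so the number of terms needed depends strongly on the evaluation point.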

Remarks. Formula (2.8) was believed to be generally true for more than a century. Cauchy then found an example of a function for which the series (2.8) converges, but not to $f(x)$ (see Sect. III.7). There are also examples of functions for which the series (2.8) does not converge at all for $x \ne 0$ (see Exercise III.7.6). A more satisfactory proof of (2.8) (due to Joh. Bernoulli) uses integral calculus and will be given in Sect. II.4.


Maclaurin’s Approach (Maclaurin 1742, p. 223-224, art. 255). For the function $y = f(x)$ and a given point $x_0$ we look for a series (or polynomial)


(2.9)    $p(x) = p_0 + (x - x_0)\,q_0 + (x - x_0)^2\,r_0 + (x - x_0)^3\,s_0 + \ldots,$

for which

(2.10)    $p^{(i)}(x_0) = f^{(i)}(x_0), \qquad i = 0, 1, 2, \ldots,$


i.e., both functions have the same derivatives up to a certain order at $x = x_0$. Setting $x = x_0$ in (2.9) yields $p_0 = p(x_0) = f(x_0)$ by (2.10). We then differentiate (2.9), again set $x = x_0$, and obtain $q_0 = p'(x_0) = f'(x_0)$. Further differentiations give $2!\,r_0 = f''(x_0)$, $3!\,s_0 = f'''(x_0)$, and so on. Therefore, the series (2.9) is identical to that of (2.8).

Partial sums of the series (2.8) are called Taylor polynomials. 


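Example. For the function given in (2.2) we choose the point $x_0 = 1$ and have $f(x_0) = -3$, $f'(x_0) = -2$, $f''(x_0) = 4$, and $f'''(x_0) = 6$. Thus, the Taylor polynomials of degree 1, 2, and 3 become

$$p_1(x) = -3 - 2(x - 1), \qquad p_2(x) = -3 - 2(x - 1) + 2(x - 1)^2, \qquad p_3(x) = -3 - 2(x - 1) + 2(x - 1)^2 + (x - 1)^3.$$
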
Newton’s Method for Roots of Equations. The Taylor polynomials are an extremely useful tool for the approximate computation of roots. We consider the example treated by Newton (1671),


(2.11)    $x^3 - 2x - 5 = 0.$


Trying out a few values of the function $f(x) = x^3 - 2x - 5$, we find $f(0) = -5$, $f(1) = -6$, $f(2) = -1$, $f(3) = 16$. Hence, there is a root close to $x_0 = 2$. The idea is now to replace the curve $f(x)$ by its tangent line at the point $x_0$, which is $p_1(x) = -1 + 10(x - 2)$. The root of $p_1(x) = 0$, which is $x = 2.1$, is then an improved approximation to the root of (2.11). We now choose $x_0 = 2.1$ and repeat the calculation. This gives $p_1(x) = 0.061 + 11.23(x - 2.1)$ and $x = 2.0945681$ as new approximation of the root of (2.11). A further step yields $x = 2.0945515$, where all digits shown are correct (see in Fig. 2.6 a facsimile of the calculation done by Newton).
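The three steps described above can be reproduced with a few lines of Python. The function name `newton_steps` is a hypothetical helper introduced for this sketch; each step replaces the curve by its tangent line at the current point and takes the root of that line.

```python
def newton_steps(f, df, x0, count):
    """Return [x0, x1, ..., x_count] with x_{k+1} = x_k - f(x_k) / f'(x_k)."""
    xs = [x0]
    for _ in range(count):
        x = xs[-1]
        # Root of the tangent line p1(t) = f(x) + f'(x) (t - x).
        xs.append(x - f(x) / df(x))
    return xs

def f(x):
    return x ** 3 - 2 * x - 5

def df(x):
    return 3 * x ** 2 - 2

xs = newton_steps(f, df, 2.0, 3)
for x in xs:
    print(f"{x:.7f}")
```

The printed iterates can be compared with the values 2.1, 2.0945681, and 2.0945515 quoted in the text.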


FIGURE 2.6. Newton’s calculation for $x^3 - 2x - 5 = 0$³


³ Reproduced with permission of Bibl. Publ. Univ. Genève.


Use of the second degree polynomial (E. Halley 1694). We choose for the above example the point $x_0 = 2.1$ and use the Taylor polynomial of degree two. This gives

$$0.061 + 11.23(x - 2.1) + 6.3(x - 2.1)^2 = 0,$$


a quadratic equation in $z = x - 2.1$, which has two roots. We choose the one that is smaller in absolute value (i.e., for which $x$ is closer to 2.1) and obtain


$$z = x - 2.1 = \frac{-11.23 + \sqrt{11.23^2 - 4 \cdot 0.061 \cdot 6.3}}{12.6},$$


hence, $x = 2.0945515$. Again, all digits shown are correct, obtained this time with only one iteration.
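The single step with the second-degree Taylor polynomial can be sketched as follows. The function name `quadratic_step` is hypothetical, and the code simply solves the quadratic equation in z = x - x0 and keeps the root of smaller absolute value, as described in the text.

```python
import math

def quadratic_step(f, df, d2f, x0):
    """One step using f(x0) + f'(x0) z + f''(x0)/2 z^2 = 0 with z = x - x0."""
    c0, c1, c2 = f(x0), df(x0), d2f(x0) / 2
    root = math.sqrt(c1 * c1 - 4 * c0 * c2)
    candidates = [(-c1 + root) / (2 * c2), (-c1 - root) / (2 * c2)]
    z = min(candidates, key=abs)   # keep x close to x0
    return x0 + z

def f(x):
    return x ** 3 - 2 * x - 5

def df(x):
    return 3 * x ** 2 - 2

def d2f(x):
    return 6 * x

x_new = quadratic_step(f, df, d2f, 2.1)
print(f"{x_new:.7f}")
```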


Exercises 


2.1 (Euler 1755, §261). Study the functions

$$y = x^4 - 8x^3 + 22x^2 - 24x + 12, \qquad y = x^5 - 5x^4 + 5x^3 + 1.$$

Find maxima, minima, convex downward regions, inflection points.


2.2 (Euler 1755, §272). The sequence of numbers

$$\sqrt[1]{1} = 1, \quad \sqrt[2]{2} = 1.4142, \quad \sqrt[3]{3} = 1.4422, \quad \sqrt[4]{4} = 1.4142, \quad \sqrt[5]{5} = 1.3797, \ldots$$

suggests that the function $y = \sqrt[x]{x} = x^{1/x}$ possesses a maximum value close to $x = 3$. Where exactly? In which relation is this value with the minimum value of $y = x^x$?


2.3 (Joh. Bernoulli 1691/92). Find $x$ such that the rectangle formed by the abscissa and the ordinate for a point on the circle $y = \sqrt{x - x^2}$ has maximal area. Verify the maximality by computing the second derivative.


2.4 (Euler 1755, §272). Find $x$ such that $x \sin x$ possesses a (local) maximum (you will find an equation that is best solved by Newton’s or Halley’s method; Euler gives the result $x = 116°14'21''20'''35''''47'''''$; the correct value of the last digits is $32''''38'''''$).


2.5 Compute for the function $y = x^3$ the second difference

$$\Delta^2 y = (x + 2\Delta x)^3 - 2(x + \Delta x)^3 + x^3.$$

Show that this difference, divided by $\Delta x^2$, tends, for $\Delta x \to 0$, to $6x$, the second derivative.


2.6 Let $f(x) = \sin(x^2)$. Compute $f'(x)$, $f''(x)$, $f'''(x)$, $f''''(x)$, ... to obtain the series of Taylor

$$f(x) = f(0) + f'(0)\,x + f''(0)\,\frac{x^2}{2!} + f'''(0)\,\frac{x^3}{3!} + f''''(0)\,\frac{x^4}{4!} + \ldots$$

Is there a much better way of obtaining this result?


2.7 Show that Newton’s method, applied to $x^2 - 2 = 0$, is identical to (I.2.13), the Babylonian computation of $\sqrt{2}$. However, formula (I.2.14) is different from Halley’s method. Why?


2.8 (Leibniz 1710). For a function $y(x) = u(x) \cdot v(x)$ show, by extending (1.4), that

$$y'' = u''v + 2u'v' + uv'', \qquad y''' = u'''v + 3u''v' + 3u'v'' + uv'''.$$

Find a general rule for $y^{(n)}$.
II.3 Envelopes and Curvature


My Brother, Professor at Basle, has taken this opportunity to investigate 
several curves that Nature sets before our eyes every day ... 

(Joh. Bernoulli 1692) 
I am quite convinced that there is hardly a geometer in the world who can 
be compared to you. (de L’Hospital 1695, letter to Joh. Bernoulli) 


Envelope of a Family of Straight Lines 


Inspired by a drawing of A. Dürer (1525, p. 38, see Fig. 3.1, right), we consider
a point $(a, 0)$ moving on the x-axis and the point $(0, 13 - a)$ moving on the
y-axis in opposite direction. If we connect these points by a straight line

(3.1)    $y = \frac{a - 13}{a}\,(x - a) = 13 + x - a - \frac{13x}{a},$

we obtain an infinity of lines which are displayed in Fig. 3.1, and which create an
interesting curve, called the envelope, which is tangent to each of these lines. The
problem is to compute this curve. Problems of this kind were extensively discussed
between Leibniz (see Leibniz 1694a), Joh. Bernoulli and de L'Hospital.


FIGURE 3.1. Family of straight lines forming a parabola and a sketch by Dürer (1525)¹


Idea. We fix the variable x to an arbitrary value, say, x = 4, for which the family
(3.1) becomes $y = 17 - a - 52/a$. We then observe that this value first increases
for increasing a (see Fig. 3.1; for a = 3, 4, 5, 6 we have y = -3.33, 0, 1.6, 2.33
respectively). During this time the point (4, y) approaches the envelope. The
envelope is finally reached precisely when this function attains its maximum value,
that is, where the derivative $y' = -1 + 52/a^2 = 0$, i.e., for $a = \sqrt{52}$. This value
is $y = 17 - 2\sqrt{52} = 2.58$.


¹ Reproduced with permission of Verlag Dr. Alfons Uhl, Nördlingen.

The same idea works for any value of x: we have to compute the derivative
of (3.1) with respect to a by considering x as a constant (“differentiare secundum
a”). This is called the partial derivative with respect to a. At points of the envelope
this derivative must vanish. Today we denote this as (see Sect. IV.3 below, see also
Jacobi 1827, Oeuvres, vol. 3, p. 65)

(3.2)    $\frac{\partial y}{\partial a} = 0.$

For Eq. (3.1) this becomes $\partial y/\partial a = -1 + 13x/a^2$ and condition (3.2) gives
$a = \sqrt{13x}$. We obtain the envelope by inserting this into (3.1),

(3.3)    $y = x - 2\sqrt{13x} + 13$

or

(3.4)    $(y - x - 13)^2 = 52x.$


This is the equation of a conic, which, in our case, turns out to be a parabola.
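
The "Idea" above can be tried numerically. The following is a minimal sketch, not part of the original text: the function names are illustrative assumptions, and the maximum over the parameter a is only approximated on a fine grid. For a fixed x it compares the largest ordinate reached by the lines (3.1) with the envelope formula (3.3).

```python
import math


def line_y(a: float, x: float) -> float:
    """Ordinate at x of the line (3.1) joining (a, 0) and (0, 13 - a)."""
    return 13.0 + x - a - 13.0 * x / a


def envelope_y(x: float) -> float:
    """Envelope (3.3): y = x - 2*sqrt(13 x) + 13."""
    return x - 2.0 * math.sqrt(13.0 * x) + 13.0


def max_over_family(x: float, steps: int = 200_000) -> float:
    """Largest ordinate at x over a grid of parameters a in (0, 13]."""
    return max(line_y(13.0 * k / steps, x) for k in range(1, steps + 1))


if __name__ == "__main__":
    for x in (1.0, 4.0, 9.0):
        print(x, max_over_family(x), envelope_y(x))
```

For x = 4 both values should agree with $17 - 2\sqrt{52}$ up to the grid resolution.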


The Caustic of a Circle 


Problem. Let $x^2 + y^2 = 1$ be a circle (Fig. 3.2) and suppose that parallel vertical
rays are reflected by this circle. This yields a new family of straight lines which
apparently produce an interesting envelope. Find the equation of this envelope.


FIGURE 3.2. The caustic of the circle (Joh. Bernoulli 1692)

FIGURE 3.3. The reflected ray


Joh. Bernoulli (1692) gives a solution “per vulgarem Geometriam Cartesianam”;
on the other hand, in his “Lectiones” (Joh. Bernoulli 1691/92b, Lectio
XXVII, “Caustica circularis radiorum parallelorum”, Opera, vol. 3, p. 467), he
uses the “modern” differential calculus.

Solution. For representing the family of reflected rays, we choose as parameter the
angle $\alpha$ between the ray and the radius vector (see Fig. 3.3). After some elementary
geometry and from the fact that the reflected ray has slope
$\tan(2\alpha - \pi/2) = -\cos 2\alpha / \sin 2\alpha$, we find the equation

(3.5)    $y = -\cos\alpha - \frac{\cos 2\alpha}{\sin 2\alpha}\,(x - \sin\alpha) = \frac{x}{2}\left(\frac{\sin\alpha}{\cos\alpha} - \frac{\cos\alpha}{\sin\alpha}\right) - \frac{1}{2\cos\alpha}.$

As required by (3.2), the condition for the caustic is expressed by

(3.6)    $\frac{\partial y}{\partial \alpha} = -\frac{\sin\alpha}{2\cos^2\alpha} + \frac{x}{2\cos^2\alpha\,\sin^2\alpha} = 0,$

which gives

(3.7)    $x = \sin^3\alpha.$

In order to obtain the equation of the caustic, we insert this into (3.5) and obtain

         $y = \frac{1}{2}\left(-\frac{1}{\cos\alpha} + \frac{\sin^4\alpha}{\cos\alpha} - \sin^2\alpha\,\cos\alpha\right) = -\cos\alpha\left(\frac{1}{2} + \sin^2\alpha\right).$

This is, together with (3.7), a parametric representation of the caustic. If we want
y expressed by x, we insert $\sin\alpha = x^{1/3}$ and obtain

(3.8)    $y = -\sqrt{1 - x^{2/3}}\,\left(\frac{1}{2} + x^{2/3}\right).$
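
As a plausibility check of (3.5) through (3.8), here is a small sketch (function names and the finite-difference step are assumptions chosen for illustration). It verifies that the caustic point lies on the reflected ray with parameter $\alpha$ and that the partial derivative with respect to $\alpha$ vanishes there, as condition (3.2) requires.

```python
import math


def ray_y(alpha: float, x: float) -> float:
    """Reflected ray (3.5) for the angle alpha, 0 < alpha < pi/2."""
    t = math.sin(alpha) / math.cos(alpha) - math.cos(alpha) / math.sin(alpha)
    return 0.5 * x * t - 0.5 / math.cos(alpha)


def caustic_point(alpha: float) -> tuple[float, float]:
    """Parametric caustic: x from (3.7), y = -cos(alpha) (1/2 + sin^2(alpha))."""
    s = math.sin(alpha)
    return s ** 3, -math.cos(alpha) * (0.5 + s * s)


def caustic_y(x: float) -> float:
    """Explicit form (3.8), valid for 0 <= x <= 1."""
    c = x ** (2.0 / 3.0)
    return -math.sqrt(1.0 - c) * (0.5 + c)


def d_ray_d_alpha(alpha: float, x: float, h: float = 1e-6) -> float:
    """Central difference approximation of the partial derivative in (3.6)."""
    return (ray_y(alpha + h, x) - ray_y(alpha - h, x)) / (2.0 * h)


if __name__ == "__main__":
    alpha = 0.7
    x, y = caustic_point(alpha)
    print(ray_y(alpha, x) - y, d_ray_d_alpha(alpha, x), caustic_y(x) - y)
```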
Envelope of Ballistic Curves


Problem. A cannon shoots bullets with initial velocity $v_0 = 1$ at all elevations.
We wish to find the envelope of all ballistic parabolas (Fig. 3.4). This question,
already considered by E. Torricelli (De motu projectorum 1644), was among the first
problems which fascinated the young Joh. Bernoulli (see Briefwechsel, p. 111).


FIGURE 3.4a. Envelope of shooting para- 
bolas 1721, in “Peterhof”, St. Petersburg 


Solution. Let a be the slope of the cannon. Then the movement of the bullet (under
a gravitational acceleration of g = 1) is given by

         $x(t) = \frac{t}{\sqrt{1 + a^2}}, \qquad y(t) = \frac{at}{\sqrt{1 + a^2}} - \frac{t^2}{2}.$

Eliminating the parameter $t = x\sqrt{1 + a^2}$, we get

(3.9)    $y = ax - \frac{x^2(1 + a^2)}{2}.$

Differentiation of (3.9) with respect to a gives $\partial y/\partial a = x - ax^2$ and the condition
(3.2) leads to $a = 1/x$. Inserting this into (3.9), we obtain

         $y = (1 - x^2)/2,$

so that the envelope is a parabola with the cannon at its focus.
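
The ballistic envelope can be checked in the same spirit. In this sketch (names, the slope range and the grid size are illustrative assumptions) the greatest height reachable at abscissa x over many slopes a is compared with $(1 - x^2)/2$.

```python
def trajectory_y(a: float, x: float) -> float:
    """Ballistic parabola (3.9) for the slope a (with v0 = 1, g = 1)."""
    return a * x - x * x * (1.0 + a * a) / 2.0


def safety_parabola_y(x: float) -> float:
    """Envelope of all trajectories: y = (1 - x^2)/2."""
    return (1.0 - x * x) / 2.0


def best_height(x: float, a_max: float = 50.0, steps: int = 500_000) -> float:
    """Greatest height reached at x over slopes a on a grid in [0, a_max]."""
    return max(trajectory_y(a_max * k / steps, x) for k in range(steps + 1))


if __name__ == "__main__":
    for x in (0.25, 0.5, 0.9):
        print(x, best_height(x), safety_parabola_y(x))
```

The grid must contain slopes near 1/x, so very small x would need a larger `a_max`.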


Curvature 


There are few Problems concerning Curves more elegant than this, or that 
give a greater Insight into their nature. 
(Newton 1671, Engl. pub. 1736, p. 59) 
Problem. For a given curve $y = f(x)$ and a given point $(a, f(a))$ on this curve,
we want to find the equation of a circle that approximates as well as possible
the function $f(x)$ in the neighborhood of a. This circle is then called the circle
of curvature and its center is the center of curvature. The inverse of its radius is
called the curvature of the curve at the point $(a, f(a))$.

Idea (Newton 1671). Let

(3.10)   $y = f(a) - \frac{1}{f'(a)}\,(x - a)$

be the normal to the curve $y = f(x)$ at the point x = a. If we increase a (“imagine
the point D to move in the curve an infinitely little distance”), we find a second
normal that intersects the first one at the center of curvature (Fig. 3.5).

The situation is identical to that of the envelopes (see Fig. 3.1b). Thus, we
compute

(3.11)   $\frac{\partial y}{\partial a} = f'(a) + \frac{f''(a)}{f'(a)^2}\,(x - a) + \frac{1}{f'(a)},$

and condition (3.2) yields for the center of curvature

(3.12)   $x_0 = a - \frac{f'(a)\,(1 + f'(a)^2)}{f''(a)}, \qquad y_0 = f(a) + \frac{1 + f'(a)^2}{f''(a)}.$

For the radius of curvature we get

(3.13)   $r = \sqrt{(x_0 - a)^2 + (y_0 - f(a))^2} = \frac{(1 + f'(a)^2)^{3/2}}{|f''(a)|}.$


FIGURE 3.5. Curvature, sketches by Newton 1671 (Meth. Fluxionum; French transl. 1740)²

Example. For the parabola $y = x^2$ we get $r = (1 + 4a^2)^{3/2}/2$, and the center of
curvature is given by

(3.14)   $x_0 = a - \frac{(1 + 4a^2)\,2a}{2} = -4a^3, \qquad y_0 = a^2 + \frac{1 + 4a^2}{2} = \frac{1}{2} + 3a^2.$


² Reproduced with permission of Editions Albert Blanchard, Paris.
These formulas form a parametric representation of the geometric locus $(x_0, y_0)$
of the centers of curvature. It is called the evolute. In the situation of Eq. (3.14),
the parameter (here a) can be eliminated and we obtain (see Fig. 3.6b)

         $y_0 = \frac{1}{2} + 3\left(\frac{x_0}{4}\right)^{2/3}.$

Fig. 3.6a illustrates the fact that the evolute is the envelope of the family of normals
to the given curve.


FIGURE 3.6a. Evolute = envelope of the normals
FIGURE 3.6b. Parabola $y = x^2$ and its evolute
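
Newton's idea of intersecting two neighboring normals can be imitated numerically. The sketch below is an illustration under assumptions (helper names are invented; the normal through $(p, f(p))$ is written as $(x - p) + f'(p)\,(y - f(p)) = 0$): it intersects the normals at p and at a nearby q and compares the result with the center of curvature for the parabola $y = x^2$.

```python
def center_of_curvature(a: float, fa: float, d1: float, d2: float) -> tuple[float, float]:
    """Center of curvature from f(a), f'(a) = d1 and f''(a) = d2."""
    s = 1.0 + d1 * d1
    return a - d1 * s / d2, fa + s / d2


def radius_of_curvature(d1: float, d2: float) -> float:
    """Radius of curvature r = (1 + f'^2)^(3/2) / |f''|."""
    return (1.0 + d1 * d1) ** 1.5 / abs(d2)


def normals_intersection(f, df, p: float, q: float) -> tuple[float, float]:
    """Intersection of the normals to y = f(x) at x = p and x = q.

    Each normal is the line x + f'(s) * y = s + f'(s) * f(s), s in {p, q}.
    """
    mp, mq = df(p), df(q)
    cp, cq = p + mp * f(p), q + mq * f(q)
    y = (cq - cp) / (mq - mp)
    x = cp - mp * y
    return x, y


if __name__ == "__main__":
    f = lambda x: x * x
    df = lambda x: 2.0 * x
    a = 1.0
    print(normals_intersection(f, df, a, a + 1e-6))
    print(center_of_curvature(a, f(a), df(a), 2.0))  # expected (-4a^3, 1/2 + 3a^2)
```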


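Curvature of a Curve in Parametric Representation. Consider a curve given by
$(x(t), y(t))$ and suppose that close to the point $(x(a), y(a))$ it can be represented
as $y = f(x)$. Then, we have by (1.20)

         $f'(x) = \frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{y'(t)}{x'(t)},$

and for the second derivative

         $f''(x) = \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d}{dt}\left(\frac{y'(t)}{x'(t)}\right) \Big/ \frac{dx}{dt} = \frac{x'(t)\,y''(t) - x''(t)\,y'(t)}{x'(t)^3}.$

Inserted into Eqs. (3.12) and (3.13), we get

(3.15)   $x_0 = x(a) - \frac{y'(a)\,\bigl(x'(a)^2 + y'(a)^2\bigr)}{x'(a)\,y''(a) - x''(a)\,y'(a)},$

(3.16)   $y_0 = y(a) + \frac{x'(a)\,\bigl(x'(a)^2 + y'(a)^2\bigr)}{x'(a)\,y''(a) - x''(a)\,y'(a)},$

(3.17)   $r = \frac{\bigl(x'(a)^2 + y'(a)^2\bigr)^{3/2}}{|x'(a)\,y''(a) - x''(a)\,y'(a)|}.$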
FIGURE 3.7. Cycloid and its evolute


FIGURE 3.8. A cycloid drawn by Joh. Bernoulli (1955, p. 254, letter of Jan. 12, 1695 to de
L'Hospital)³


Example. The cycloid (trajectory of the valve of the wheel of a bike) is given by
the parametric representation

(3.18)   $x = t - \sin t, \qquad y = 1 - \cos t.$

Computing its derivatives, we obtain from Eqs. (3.15) through (3.17) that the
evolute of the cycloid is given by

(3.19)   $x_0 = a + \sin a, \qquad y_0 = -1 + \cos a.$

This is a cycloid again, in a different position.
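
The cycloid example can be reproduced directly from the parametric formulas for the center of curvature. A minimal sketch follows; the function names are illustrative, and the derivatives of the cycloid are entered by hand.

```python
import math


def evolute_point(x, y, dx, dy, ddx, ddy):
    """Center of curvature of a parametric curve, following (3.15) and (3.16)."""
    denom = dx * ddy - ddx * dy
    s = dx * dx + dy * dy
    return x - dy * s / denom, y + dx * s / denom


def cycloid_evolute(t: float) -> tuple[float, float]:
    """Evolute point of the cycloid x = t - sin t, y = 1 - cos t (t not a multiple of 2*pi)."""
    x, y = t - math.sin(t), 1.0 - math.cos(t)
    dx, dy = 1.0 - math.cos(t), math.sin(t)
    ddx, ddy = math.sin(t), math.cos(t)
    return evolute_point(x, y, dx, dy, ddx, ddy)


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0, 3.0):
        print(cycloid_evolute(t), (t + math.sin(t), -1.0 + math.cos(t)))
```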


Involutes. We now start from a given evolute ABB (see Fig. 3.6b) and construct a 
new curve CC defined by the property that the arc length ABC is constant (imagine 
a string unwinding from the evolute). These new curves are called involutes. If one 
point of the involute coincides with the original function f(x), both curves will 
have the same curvature. It then follows (to be proved rigorously by the ideas of 
Sect. III.6) that both curves are identical. Hence, not only the evolute, but also the 
involute of the cycloid (with the correct choice of the arc length) is again a cycloid 
(Newton 1671, Prob. V, Nr. 34). Huygens (1673) used this property to construct the 
best pendulum-clocks of his century, based on the fact that a pendulum following 
a cycloid is isochronous (see Fig. 7.8 of Sect. II.7). 


³ Reproduced with permission of Birkhaeuser Verlag, Basel.
Exercises


3.1 A bar of length 1 glides along a vertical wall (see Fig. 3.9a). Find a formula
    for the created envelope.

3.2 Find a formula for the envelope (see Fig. 3.9b) created by the family


FIGURE 3.9. Evolutes and envelopes


3.3 (Cauchy 1824). Find the envelope created by the family of parabolas

        $y = b(x + b)^2$

    with parameter b (see Fig. 3.9c).

3.4 Compute for the function $y = \ln x$ the radius of curvature at the point a and
    determine a for which this radius is minimal (see Fig. 3.9d). It can be seen
    that the evolute has a stationary point (a cusp) at this minimal position.

3.5 Compute the evolute of the ellipse (see Fig. 3.9e)

        $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \qquad\text{or}\qquad x = a\cos t, \quad y = b\sin t.$

    Determine the maximal and minimal curvature.

    Result. $x = \left(a - \frac{b^2}{a}\right)\cos^3 t, \quad y = \left(b - \frac{a^2}{b}\right)\sin^3 t.$

3.6 Compute the radius of curvature of the catenary $y = (e^x + e^{-x})/2$. Show
    that this radius for a given point M on the curve is equal to the length of the
    normal MN (see Fig. 3.9f).

3.7 One observes in Fig. 3.7 that a spoke of a rolling wheel creates an envelope
    that resembles a half-sized mini cycloid. This becomes more visible when
    the entire diameter is drawn (Fig. 3.10). Compute the envelope of this family
    of straight lines.


Johann Bernoulli (1667-1748)⁴
Guillaume-François-Antoine de L'Hospital (1661-1704)⁵, Marquis de Sainte-Mesme et du Montellier,
Compte d'Autremonts, Seigneur d'Ouques et autres lieux


⁴ Reproduced with permission of Georg Olms Verlag, Hildesheim.
⁵ Reproduced with permission of Birkhaeuser Verlag, Basel.


II.4 Integral Calculus


... notam ∫ pro summis, ut adhibetur nota d pro differentiis ...
(Letter of Leibniz to Joh. Bernoulli, March 8/18, 1696)


... quod autem ... vocabulum integralis etiamnum usurpaverim ...
(Letter of Joh. Bernoulli to Leibniz, April 7, 1696)


And whereas Mʳ Leibnits prefixes the letter ∫ to the Ordinate of a curve
to denote the Summ of the Ordinates or area of the Curve, I did some years
before represent the same thing by inscribing the Ordinate in a square ....
My symbols therefore ... are the oldest in the kind.
(Newton, letter to Keill, April 20, 1714)
The integral calculus is, in fact, much older than the differential calculus, because 
the computation of areas, surfaces, and volumes occupied the greatest mathemati- 
cians since antiquity: Archimedes, Kepler, Cavalieri, Viviani, Fermat (see The- 
orem I.3.2), Gregory St. Vincent, Guldin, Gregory, Barrow. The decisive break- 
through came when Newton, Leibniz, and Joh. Bernoulli discovered indepen- 
dently that integration is the inverse operation of differentiation, thus reducing 
all efforts of the above researchers to a couple of differentiation rules. The inte- 
gral sign is due to Leibniz (1686), the term “integral” is due to Joh. Bernoulli and 
was published by his brother Jac. Bernoulli (1690). 


Primitives 


For a given function $y = f(x)$ we want to compute the area between the x-axis
and the graph of this function. We fix a point a and denote by $z = F(x)$ the area
under $f(x)$ between a and x (Fig. 4.1a). The crucial fact is then that

(4.1)    the function $f(x)$ is the derivative of $F(x)$.

We then call $F(x)$ a primitive of $f(x)$.


FIGURE 4.1a. Newton's idea    FIGURE 4.1b. Leibniz's idea    FIGURE 4.1c. Sketch by Newton¹


¹ Reproduced with permission of Editions Albert Blanchard, Paris.
Justification. Newton imagines that the segment BD moves over the area under
consideration (“And conceive these Areas ... to be generated by the lines BE and
BD, as they move along ...”, Figs. 4.1a, 4.1c); consequently, if x increases by $\Delta x$,
the area increases by $\Delta z = F(x + \Delta x) - F(x)$ which, neglecting higher order
terms of $\Delta x$, is $f(x)\,\Delta x$ (the dark rectangle of Fig. 4.1a). In the limit $\Delta x \to 0$,
we thus have

(4.2)    $dz = f(x) \cdot dx \qquad\text{and}\qquad \frac{dz}{dx} = f(x).$

Leibniz imagines the area as being a sum (later: “integral”) of small rectangles
(Fig. 4.1b):

(4.3)    $z_n = f(x_1)\,\Delta x_1 + f(x_2)\,\Delta x_2 + \ldots + f(x_n)\,\Delta x_n.$

This implies that

         $z_n - z_{n-1} = f(x_n)\,\Delta x_n,$

and we again get (4.2) when $\Delta x_i \to 0$. Consequently, the derivative is the
inverse operation of the integral, much as the difference is the inverse operation to
addition.
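
Both viewpoints can be imitated with a few lines of code. The sketch below is an illustration only (equal subintervals and right endpoints are assumptions made for simplicity): it forms Leibniz's sum (4.3) for f = cos and checks Newton's observation that the increment of the area, divided by $\Delta x$, is close to $f(x)$.

```python
import math


def area(f, a: float, x: float, n: int = 100_000) -> float:
    """Leibniz's sum (4.3) with n rectangles of equal width on [a, x]."""
    h = (x - a) / n
    return h * sum(f(a + i * h) for i in range(1, n + 1))


def area_increment_ratio(f, a: float, x: float, dx: float = 1e-3) -> float:
    """Newton's quotient (z(x + dx) - z(x)) / dx for the area z under f."""
    return (area(f, a, x + dx) - area(f, a, x)) / dx


if __name__ == "__main__":
    print(area(math.cos, 0.0, 1.0), math.sin(1.0))                  # area vs. primitive
    print(area_increment_ratio(math.cos, 0.0, 1.0), math.cos(1.0))  # slope vs. f
```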


After long attempts, Leibniz symbolizes the sum in (4.3) (for the limit
$\Delta x_i \to 0$) by (see Fig. 4.2)

(4.4)    $\int f(x)\,dx.$

Nowadays, this area between the bounds a and b is denoted by (Fourier 1822)

(4.5)    $\int_a^b f(x)\,dx,$

whereas (4.4), the “indefinite integral”, stands for an arbitrary primitive $F(x)$ of
$f(x)$.


FIGURE 4.2. First publication of the integral sign, an old-style “s” (Leibniz 1686)²


Primitives are not unique; to each primitive $F(x)$ one can add an arbitrary
constant C and $F(x) + C$ is again a primitive of the same function. For
$C = -F(a)$ we obtain the primitive $F(x) - F(a)$, which vanishes for x = a (as does
also the area z). Therefore, the area between a and b is

(4.6)    $\int_a^b f(x)\,dx = F(b) - F(a)$

(see the “Fundamental Theorem of Differential Calculus” in Sect. III.6).

By reversing differentiation formulas we obtain formulas for primitives. For
example, the function $f(x) = x^{n+1}$ has $f'(x) = (n + 1)x^n$ as derivative. Therefore
$x^{n+1}/(n + 1)$ is a primitive of $x^n$. This and other formulas of Sect. II.1 are
collected in Table 4.1.


² Reproduced with permission of Bibl. Publ. Univ. Genève.


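TABLE 4.1. A short table of primitives

    $\int x^n\,dx = \frac{x^{n+1}}{n+1} + C \quad (n \ne -1)$        $\int \frac{1}{x}\,dx = \ln x + C$

    $\int e^x\,dx = e^x + C$

    $\int \sin x\,dx = -\cos x + C$                    $\int \cos x\,dx = \sin x + C$

    $\int \frac{dx}{1 + x^2} = \arctan x + C$                  $\int \frac{dx}{\sqrt{1 - x^2}} = \arcsin x + C$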


Large tables of primitives can be many hundreds of pages long. We mention
the tables of Gröbner & Hofreiter (1949) and Gradshteyn & Ryzhik (1980). In
recent years this knowledge has been incorporated into many symbolic computer
systems.


Applications 


Area of Parabolas. The area under the nth degree parabola $y = x^n$ between a
and b becomes by (4.6) and Table 4.1

(4.7)    $\int_a^b x^n\,dx = \frac{x^{n+1}}{n+1}\,\Big|_a^b = \frac{b^{n+1} - a^{n+1}}{n+1},$

where we have used the notation $F(x)\big|_a^b = F(b) - F(a)$. For a = 0 this formula
is Fermat's Theorem I.3.2.


Area of a Disc. To compute the area of a quarter of a disc we consider the function
$f(x) = \sqrt{1 - x^2}$ for $0 \le x \le 1$. A primitive of $f(x)$ is

(4.8)    $F(x) = \frac{x}{2}\sqrt{1 - x^2} + \frac{1}{2}\arcsin x.$

This can be checked by differentiating (4.8). Later we shall see how such formulas
are actually found. Applying (4.6), we thus get

         area of unit disc $= 4\int_0^1 \sqrt{1 - x^2}\,dx = 4\,(F(1) - F(0)) = \pi,$

since $\sin(\pi/2) = 1$.
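
The check "by differentiating (4.8)" can also be done numerically. In this small sketch (names and the step size are illustrative assumptions) the primitive is differentiated by a central difference and the area of the unit disc is evaluated from it.

```python
import math


def quarter_disc(x: float) -> float:
    """Integrand f(x) = sqrt(1 - x^2)."""
    return math.sqrt(1.0 - x * x)


def quarter_disc_primitive(x: float) -> float:
    """Primitive (4.8): F(x) = x/2 * sqrt(1 - x^2) + 1/2 * arcsin x."""
    return 0.5 * x * math.sqrt(1.0 - x * x) + 0.5 * math.asin(x)


def central_difference(g, x: float, h: float = 1e-6) -> float:
    """Approximate g'(x) by (g(x + h) - g(x - h)) / (2h)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)


if __name__ == "__main__":
    x = 0.3
    print(central_difference(quarter_disc_primitive, x), quarter_disc(x))
    print(4.0 * (quarter_disc_primitive(1.0) - quarter_disc_primitive(0.0)), math.pi)
```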
There is another elegant way of computing the area of a disc. Nothing forces
us to assume that $f(x)\,dx$ are slices of small vertical rectangles. Let us cut the disc
(of radius a) into infinitely thin triangles (Kepler 1615, see as well Leibniz's idea,
Fig. I.4.11). The area of such a triangle is

         $dS = \frac{a^2 \cdot d\varphi}{2},$

where $d\varphi$ is the infinitely small increment of the angle. The whole area is (sum of
all these triangles)

         $S = \int_0^{2\pi} \frac{a^2\,d\varphi}{2} = \frac{a^2}{2}\,\varphi\,\Big|_0^{2\pi} = \frac{a^2}{2}\cdot 2\pi = a^2\pi.$

Volume of the Sphere. Consider a sphere of radius a (see Fig. 4.3) and let us cut
it into thin slices (discs of thickness dx and of radius $r = \sqrt{a^2 - x^2}$). The volume
of such a slice is $dV = r^2\pi\,dx = (a^2 - x^2)\pi\,dx$ and for the total volume of the
sphere we get

         $V = \int_{-a}^{a} (a^2 - x^2)\,\pi\,dx = \left(a^2x - \frac{x^3}{3}\right)\pi\,\Big|_{-a}^{a} = \left(2a^3 - \frac{2a^3}{3}\right)\pi = \frac{4a^3\pi}{3}.$


FIGURE 4.3. Volume of a sphere


Work in a Force Field. Suppose that a force $f(s)$ acts in the direction of a straight
line parameterized by s. The work in moving a body from s to $s + \Delta s$ is equal to
$f(s)\,\Delta s$ (force × length). Therefore, the total work is $\int_a^b f(s)\,ds$.


Example. The gravitational force of the earth on a mass of 1 kg is
$f(s) = 9.81 \cdot R^2/s^2$ [N], if s is the distance to the center. Hence, the energy in moving
1 kg from the surface to infinity is given by

         $E = \int_R^\infty 9.81\,\frac{R^2}{s^2}\,ds = -9.81\,\frac{R^2}{s}\,\Big|_R^\infty = 9.81\,R = 62 \cdot 10^6$ [J].


Arc Length. 
The fluxion of the Length is determin’d by putting it equal to the square- 
root of the sum of the squares of the fluxion of the Absciss and of the 
Ordinate. (Newton 1736, Fluxions, p. 130) 
We wish to compute the length L of a given curve $y(x)$, $a \le x \le b$. If we increase
x by $\Delta x$ (see Fig. 4.4), the ordinate is increased by $\Delta y = y'(x)\,\Delta x$ (we neglect
higher order terms). Therefore, the length of a small part of the curve is given by
$\Delta s$, where

         $\Delta s^2 = \Delta x^2 + \Delta y^2 = (1 + y'(x)^2)\,\Delta x^2$

(theorem of Pythagoras). For the limit $\Delta x \to 0$ we obtain

(4.9)    $ds = \sqrt{1 + y'(x)^2} \cdot dx \qquad\text{and}\qquad L = \int_a^b \sqrt{1 + y'(x)^2}\,dx.$


FIGURE 4.4. Arc length of $y = x^2$


Example. For the parabola $y = x^2$ we have $y' = 2x$ and the length of the arc
between x = 0 and x = 1 is given by (see (4.27) below)

         $L = \int_0^1 \sqrt{1 + 4x^2}\,dx = \left[\frac{x}{2}\sqrt{1 + 4x^2} + \frac{1}{4}\ln\bigl(2x + \sqrt{4x^2 + 1}\bigr)\right]_0^1 = \frac{\sqrt{5}}{2} + \frac{1}{4}\ln(2 + \sqrt{5}).$

Center of Mass. Consider, for example, two masses $m_1$, $m_2$ placed at the points
with abscissas $x_1$, $x_2$. The moment applied at the origin is $m_1x_1 + m_2x_2$. The
center of mass $\bar{x}$ is the point where both masses, concentrated, would produce the
same moment, i.e.,

(4.10)   $(m_1 + m_2) \cdot \bar{x} = m_1x_1 + m_2x_2.$

If the density of a body varies continuously in such a manner that a slice of thickness
dx has the mass $m(x)\,dx$, we have, by analogy with (4.10),

(4.11)   $\int_a^b m(x)\,dx \cdot \bar{x} = \int_a^b x\,m(x)\,dx \qquad\text{and}\qquad \bar{x} = \frac{\int_a^b x\,m(x)\,dx}{\int_a^b m(x)\,dx}.$

Example. For a triangle formed by the straight line $y = cx$, $0 \le x \le a$, we have

(4.12)   $m(x) = cx, \qquad \bar{x} = \frac{\int_0^a cx^2\,dx}{\int_0^a cx\,dx} = \frac{a^3/3}{a^2/2} = \frac{2a}{3}.$

Remark. For a random variable X with “density function” $f(x)$ (which satisfies
$\int_{-\infty}^{\infty} f(x)\,dx = 1$), the value $\bar{x} = \int_{-\infty}^{\infty} x\,f(x)\,dx$ is the average of X.


Integration Techniques


We shall now explain some general techniques for finding a primitive. A sys- 
tematic approach for some important classes of functions will be presented in 
Sect. II.5. 

A first observation is that integration is a linear operation, i.e.,

(4.13)   $\int \bigl(c_1f_1(x) + c_2f_2(x)\bigr)\,dx = c_1\int f_1(x)\,dx + c_2\int f_2(x)\,dx.$

This follows at once from the fact that differentiation is linear (see (1.3)).

Substitution of a New Variable. Suppose that

         $F(z)$ is a primitive of $f(z)$,

i.e., $F'(z) = f(z)$, and consider the substitution $z = g(x)$, which transforms the
variable z into x. It then follows from (1.16) that

         $F(g(x))$ is a primitive of $f(g(x))\,g'(x)$.

Consequently, we have

(4.14)   $\int_a^b f(g(x))\,g'(x)\,dx = \int_{g(a)}^{g(b)} f(z)\,dz,$

because, by (4.6), both terms are equal to $F(g(b)) - F(g(a))$. The expression to
the left is obtained by substituting $z = g(x)$ in $f(z)$ and $dz = g'(x)\,dx$.


Geometric Interpretation. We want to compute

         $\int_0^{1.5} \frac{4x}{1 + x^2}\,dx$

and use the substitution $z = x^2$. Since $dz = 2x\,dx$, we obtain from Eq. (4.14)

         $\int_0^{1.5} \frac{2}{1 + x^2}\,2x\,dx = \int_0^{2.25} \frac{2}{1 + z}\,dz = 2\ln(1 + z)\,\Big|_0^{2.25} = 2\ln(1 + x^2)\,\Big|_0^{1.5} = 2\ln(3.25).$


FIGURE 4.5. Substitution of a variable in an integral


Fig. 4.5 illustrates the transformation $z = x^2$ and the functions $4x/(1 + x^2)$ and
$2/(1 + z)$. Points x and $x + \Delta x$ are mapped to $z = x^2$ and
$z + \Delta z = x^2 + 2x\,\Delta x + \Delta x^2$. Therefore, the shaded rectangles have, for $\Delta x \to 0$, the same areas, and both
integrals in (4.14) give the same value.
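
The equality of the two integrals can be observed numerically. This sketch uses a simple midpoint rule (the rule and the number of subintervals are assumptions made for illustration) on both sides of the substitution $z = x^2$.

```python
import math


def midpoint_rule(f, a: float, b: float, n: int = 100_000) -> float:
    """Approximate the integral of f over [a, b] by n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def substitution_example() -> tuple[float, float, float]:
    """Both sides of (4.14) for f(z) = 2/(1+z), g(x) = x^2, and the exact value."""
    left = midpoint_rule(lambda x: 4.0 * x / (1.0 + x * x), 0.0, 1.5)
    right = midpoint_rule(lambda z: 2.0 / (1.0 + z), 0.0, 2.25)
    return left, right, 2.0 * math.log(3.25)


if __name__ == "__main__":
    print(substitution_example())
```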


Examples. All the art consists in finding a “good” substitution. This will be 
demonstrated in a series of examples. 

For functions of the form $f(ax + b)$ the substitution $z = ax + b$ is often
useful. For example, with $z = 5x + 2$, $dz = 5\,dx$, we have

(4.15)   $\int e^{5x+2}\,dx = \int e^z\,\frac{dz}{5} = \frac{1}{5}\,e^z = \frac{1}{5}\,e^{5x+2}.$

Sometimes the presence of the factor $g'(x)$ for the substitution $z = g(x)$
can easily be recognized. For example, in the integral below the factor x suggests
using $z = -x^2$, $dz = -2x\,dx$ and we obtain

(4.16)   $\int x\,e^{-x^2}\,dx = -\frac{1}{2}\int e^z\,dz = -\frac{1}{2}\,e^z = -\frac{1}{2}\,e^{-x^2}.$

From Table 4.1 we obtain the integrals of $1/(1 + x^2)$ or $1/\sqrt{1 - x^2}$. If we
want to find a primitive for, say, $1/(7 + x^2)$ or $1/\sqrt{7 - x^2}$ we use the substitution
$x^2 = 7z^2$ or $x = \sqrt{7}\,z$, $dx = \sqrt{7}\,dz$. This yields

(4.17)   $\int \frac{dx}{7 + x^2} = \int \frac{\sqrt{7}\,dz}{7(1 + z^2)} = \frac{1}{\sqrt{7}}\arctan z = \frac{1}{\sqrt{7}}\arctan\frac{x}{\sqrt{7}}.$
Quadratic expressions $x^2 + 2bx + c$ are often simplified by restoring a complete
square as $(x + b)^2 + (c - b^2)$ followed by the substitution $z = x + b$. In
this way the following integral is reduced, by the substitution $z = x + 1/2$, to the
integral in (4.17):

(4.18)   $\int \frac{dx}{x^2 + x + 1} = \int \frac{dz}{z^2 + 3/4} = \frac{2}{\sqrt{3}}\arctan\frac{2z}{\sqrt{3}} = \frac{2}{\sqrt{3}}\arctan\left(\frac{2x}{\sqrt{3}} + \frac{1}{\sqrt{3}}\right).$

As a last example, we consider the function $(x + 2)/(x^2 + x + 1)$. Here we
write (Euler 1768, § 62) the numerator as $x + 2 = (x + 1/2) + 3/2$ so that the first
part $x + 1/2$ is a scalar multiple of the derivative of the denominator. This part of
the integral is then computed with the substitution $z = x^2 + x + 1$. The second
part is a multiple of (4.18), and we obtain

(4.19)   $\int \frac{x + 2}{x^2 + x + 1}\,dx = \frac{1}{2}\ln(x^2 + x + 1) + \sqrt{3}\,\arctan\left(\frac{2x + 1}{\sqrt{3}}\right).$
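
A primitive found by substitution can be tested against a numerical quadrature. The sketch below (names, the interval and the midpoint rule are assumptions for illustration) does this for the primitive of $(x + 2)/(x^2 + x + 1)$.

```python
import math


def integrand(x: float) -> float:
    """The function (x + 2)/(x^2 + x + 1)."""
    return (x + 2.0) / (x * x + x + 1.0)


def primitive(x: float) -> float:
    """Primitive (4.19): 1/2 ln(x^2 + x + 1) + sqrt(3) arctan((2x + 1)/sqrt(3))."""
    root3 = math.sqrt(3.0)
    return 0.5 * math.log(x * x + x + 1.0) + root3 * math.atan((2.0 * x + 1.0) / root3)


def midpoint_rule(f, a: float, b: float, n: int = 100_000) -> float:
    """Approximate the integral of f over [a, b] by n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


if __name__ == "__main__":
    a, b = -1.0, 2.0
    print(midpoint_rule(integrand, a, b), primitive(b) - primitive(a))
```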


Integration by Parts. A second integration technique is based on the differentiation
rule for products (1.4). Integrating the formula $(uv)' = u'v + uv'$ gives
$u(x)v(x) = \int \bigl(u'(x)v(x) + u(x)v'(x)\bigr)\,dx$, or equivalently

(4.20)   $\int u'(x)\,v(x)\,dx = u(x)\,v(x) - \int u(x)\,v'(x)\,dx.$

In this formula, one integral is replaced by another. However, if the factors $u'$ and
$v$ are properly chosen, the second integral can be easier to evaluate than the first
one.


Examples. Let us try to compute $\int x \sin x\,dx$. It would be no use choosing $u'(x) = x$
($u(x) = x^2/2$) and $v(x) = \sin x$ because then the second integral would be even
more difficult to evaluate. Therefore, we choose $u'(x) = \sin x$ ($u(x) = -\cos x$)
and $v(x) = x$. Equation (4.20) then gives

(4.21)   $\int x \sin x\,dx = -x\cos x + \int 1 \cdot \cos x\,dx = -x\cos x + \sin x.$

Sometimes it is necessary to repeat the integration by parts. In the following
example, we first put $v(x) = x^2$, $u'(x) = e^x$, and for the second integration by
parts we put $v(x) = x$, $u'(x) = e^x$:

(4.22)   $\int x^2e^x\,dx = x^2e^x - 2\int x\,e^x\,dx = e^x(x^2 - 2x + 2).$

Functions such as $\ln x$ or $\arctan x$ have simple derivatives. They will be
frequently used in the role of $v(x)$:

(4.23)   $\int \ln x\,dx = \int 1 \cdot \ln x\,dx = x\ln x - \int \frac{x}{x}\,dx = x(\ln x - 1),$

(4.24)   $\int \arctan x\,dx = x\arctan x - \int \frac{x}{1 + x^2}\,dx = x\arctan x - \frac{1}{2}\ln(1 + x^2).$

Here, the last integral is evaluated with the substitution $z = 1 + x^2$, $dz = 2x\,dx$.
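
Repeated integration by parts becomes mechanical for integrands of the form $p(x)\,e^x$ with a polynomial p: each step with $v = p$, $u' = e^x$ replaces p by its derivative and flips the sign, so that $\int p\,e^x\,dx = (p - p' + p'' - \ldots)\,e^x$. The sketch below implements just this special case (coefficient lists with the constant term first are an assumption of the sketch, as are the function names).

```python
def poly_derivative(coeffs: list) -> list:
    """Derivative of a polynomial given as [c0, c1, c2, ...] (c_i multiplies x^i)."""
    return [i * coeffs[i] for i in range(1, len(coeffs))]


def integrate_poly_times_exp(coeffs: list) -> list:
    """Return q with  integral p(x) e^x dx = q(x) e^x,  q = p - p' + p'' - ...

    Every pass of the loop is one integration by parts with v = p, u' = e^x.
    """
    q = [0] * len(coeffs)
    p = list(coeffs)
    sign = 1
    while p:
        for i, c in enumerate(p):
            q[i] += sign * c
        p = poly_derivative(p)
        sign = -sign
    return q


if __name__ == "__main__":
    # p(x) = x^2 reproduces (4.22): the primitive is e^x (x^2 - 2x + 2).
    print(integrate_poly_times_exp([0, 0, 1]))
```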
Consider next the integral $\int \sqrt{1 + 4x^2}\,dx$, which we encountered in the
computation of the parabola's arc length. Integration by parts with $u'(x) = 1$,
$v(x) = \sqrt{1 + 4x^2}$ yields

(4.25)   $\int \sqrt{1 + 4x^2}\,dx = x\sqrt{1 + 4x^2} - \int \frac{4x^2}{\sqrt{1 + 4x^2}}\,dx.$

Here, the second integral does not look much better than the first one. However,
the numerator can be written as $4x^2 = (1 + 4x^2) - 1$. The integral can then
be split into two parts, one of which is $-\int \sqrt{1 + 4x^2}\,dx$ (the integral we are
looking for) and can be transferred to the left side; the other resembles the last
integral of Table 4.1: the derivative of $\operatorname{arsinh} z$ is $1/\sqrt{1 + z^2}$ and we have, with
the substitution $z = 2x$ (see Exercise I.4.3),

(4.26)   $\int \frac{dx}{\sqrt{1 + 4x^2}} = \frac{1}{2}\operatorname{arsinh}(2x) = \frac{1}{2}\ln\bigl(2x + \sqrt{4x^2 + 1}\bigr).$

This gives, for (4.25),

(4.27)   $\int \sqrt{1 + 4x^2}\,dx = \frac{x}{2}\sqrt{1 + 4x^2} + \frac{1}{4}\ln\bigl(2x + \sqrt{4x^2 + 1}\bigr).$


Recurrence Relations. Suppose we want to compute

(4.28)   $I_n = \int \sin^n x\,dx.$

We put $u'(x) = \sin x$, $v(x) = \sin^{n-1} x$ and apply integration by parts. This yields

         $\int \sin^n x\,dx = -\cos x\,\sin^{n-1} x + (n - 1)\int \cos^2 x\,\sin^{n-2} x\,dx.$

We insert $\cos^2 x = 1 - \sin^2 x$ and the right integral can be split into the two
integrals $I_{n-2}$ and $I_n$. Putting $I_n$ on the left side, we obtain
$(1 + n - 1)\,I_n = -\cos x\,\sin^{n-1} x + (n - 1)\,I_{n-2}$, or

(4.29)   $I_n = -\frac{1}{n}\cos x\,\sin^{n-1} x + \frac{n - 1}{n}\,I_{n-2}.$

This recurrence relation can be used to reduce the computation of $I_n$ to that of
$I_1 = \int \sin x\,dx = -\cos x$ (if n is odd), or to that of $I_0 = \int dx = x$ (if n is even).
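
The recurrence translates almost literally into a recursive function. The following is a minimal sketch (the function name is illustrative); it evaluates the primitive of $\sin^n x$ at a point by descending to $I_1 = -\cos x$ or $I_0 = x$.

```python
import math


def sin_power_primitive(n: int, x: float) -> float:
    """Value at x of the primitive I_n of sin^n, computed with the recurrence (4.29)."""
    if n == 0:
        return x
    if n == 1:
        return -math.cos(x)
    boundary = -math.cos(x) * math.sin(x) ** (n - 1) / n
    return boundary + (n - 1) / n * sin_power_primitive(n - 2, x)


if __name__ == "__main__":
    half_pi = math.pi / 2.0
    # Definite integrals over [0, pi/2]; the known values are 2/3 and 3*pi/16.
    print(sin_power_primitive(3, half_pi) - sin_power_primitive(3, 0.0))
    print(sin_power_primitive(4, half_pi) - sin_power_primitive(4, 0.0))
```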
As a further example, consider the integral

(4.30)   $I_n = \int \frac{dx}{(1 + x^2)^n}.$

In the absence of a better idea, let us apply integration by parts with $u'(x) = 1$
and $v(x) = 1/(1 + x^2)^n$:

         $I_n = \int 1 \cdot \frac{1}{(1 + x^2)^n}\,dx = \frac{x}{(1 + x^2)^n} + n\int \frac{2x^2}{(1 + x^2)^{n+1}}\,dx.$

Using the same trick as in (4.25), we write in the last integral $2x^2 = 2(1 + x^2) - 2$
and obtain

         $I_n = \frac{x}{(1 + x^2)^n} + 2n\,I_n - 2n\,I_{n+1}.$

We are unlucky because the index n, instead of becoming smaller, became larger.
But this is of no importance: we reverse the formula and get

(4.31)   $I_{n+1} = \frac{1}{2n}\,\frac{x}{(1 + x^2)^n} + \frac{2n - 1}{2n}\,I_n.$

This relation reduces the computation of (4.30) to that of $I_1 = \arctan x$.
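
The reversed recurrence can be coded in the same way as the one for powers of the sine. In this sketch (function names and the comparison by a midpoint rule are illustrative assumptions) the primitive of $1/(1 + x^2)^n$ is evaluated recursively down to $I_1 = \arctan x$.

```python
import math


def inverse_power_primitive(n: int, x: float) -> float:
    """Value at x of the primitive I_n of 1/(1+x^2)^n (n >= 1), via (4.31)."""
    if n == 1:
        return math.atan(x)
    m = n - 1  # (4.31) expresses I_{m+1} through I_m
    return x / (2 * m * (1.0 + x * x) ** m) + (2 * m - 1) / (2 * m) * inverse_power_primitive(m, x)


def midpoint_rule(f, a: float, b: float, n: int = 100_000) -> float:
    """Approximate the integral of f over [a, b] by n midpoint rectangles."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


if __name__ == "__main__":
    exact = inverse_power_primitive(3, 1.0) - inverse_power_primitive(3, 0.0)
    print(exact, midpoint_rule(lambda x: 1.0 / (1.0 + x * x) ** 3, 0.0, 1.0))
```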


Taylor’s Formula with Remainder 


Joh. Bernoulli (1694b, “Effectiones omnium quadraturam ...”) computed integrals
by repeated integration by parts and obtained “generalissimam” series similar
to those found later by Taylor. Cauchy (1821) then discovered that this method,
cleverly modified, leads precisely to Taylor's series of a function f with the error
term expressed by an integral.

The idea is to write (see (4.6))

         $f(x) = f(a) + \int_a^x 1 \cdot f'(t)\,dt$

and to apply integration by parts with $u'(t) = 1$ and $v(t) = f'(t)$. The crucial fact
is that we put $u(t) = -(x - t)$ (x is a constant) instead of $u(t) = t$. We thus get

         $f(x) = f(a) - (x - t)\,f'(t)\,\Big|_a^x + \int_a^x (x - t)\,f''(t)\,dt = f(a) + (x - a)\,f'(a) + \int_a^x (x - t)\,f''(t)\,dt.$

In the next step, we put $u(t) = -(x - t)^2/2!$ and $v(t) = f''(t)$ to obtain

         $f(x) = f(a) + (x - a)\,f'(a) + \frac{(x - a)^2}{2!}\,f''(a) + \int_a^x \frac{(x - t)^2}{2!}\,f'''(t)\,dt.$

Continuing this procedure, we arrive at the desired result:

(4.32)   $f(x) = \sum_{i=0}^{k} \frac{(x - a)^i}{i!}\,f^{(i)}(a) + \int_a^x \frac{(x - t)^k}{k!}\,f^{(k+1)}(t)\,dt.$

Example. For $f(x) = e^x$, $f^{(i)}(x) = e^x$, and a = 0, Eq. (4.32) becomes

(4.33)   $e^x = 1 + x + \frac{x^2}{2!} + \ldots + \frac{x^k}{k!} + \int_0^x \frac{(x - t)^k}{k!}\,e^t\,dt.$

You might now be astonished at seeing the error of the series expressed by an
integral, after having had all these difficulties in evaluating such integrals. If the
integral in (4.33) is computed by the above skillful methods, one obtains, of course,
simply $e^x - \sum_{i=0}^{k} x^i/i!$, which will be of no help at all. The idea is to replace
the integrand in (4.33) by something simpler. For example, if we suppose that
$0 \le x \le 1$, then $0 \le t \le 1$ too, and the function $e^t$ lies between the bounds 1
and 3. It therefore appears convincing (this will later be Theorem III.5.14) that the
corresponding area will also lie between

         $\int_0^x \frac{(x - t)^k}{k!} \cdot 1\,dt = \frac{x^{k+1}}{(k + 1)!} \qquad\text{and}\qquad \int_0^x \frac{(x - t)^k}{k!} \cdot 3\,dt = \frac{3x^{k+1}}{(k + 1)!}.$

This allows the conclusion that, say, for k = 10 the error is smaller than $10^{-7}$.
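
The bounds on the remainder can be observed numerically. In the sketch below (names, the midpoint rule and the sample values of x and k are assumptions chosen for illustration) the error of the truncated exponential series is compared with the two bounds and with a quadrature of the remainder integral in (4.33).

```python
import math


def taylor_exp(x: float, k: int) -> float:
    """Partial sum 1 + x + ... + x^k/k! of the exponential series."""
    return sum(x ** i / math.factorial(i) for i in range(k + 1))


def remainder_integral(x: float, k: int, n: int = 20_000) -> float:
    """Midpoint-rule value of the remainder  integral_0^x (x-t)^k/k! e^t dt."""
    h = x / n
    fact = math.factorial(k)
    return h * sum((x - (i + 0.5) * h) ** k / fact * math.exp((i + 0.5) * h) for i in range(n))


def remainder_bounds(x: float, k: int) -> tuple[float, float]:
    """Bounds x^(k+1)/(k+1)! and 3 x^(k+1)/(k+1)!, valid for 0 <= x <= 1."""
    lower = x ** (k + 1) / math.factorial(k + 1)
    return lower, 3.0 * lower


if __name__ == "__main__":
    print(math.e - taylor_exp(1.0, 10), remainder_bounds(1.0, 10))
    print(math.e - taylor_exp(1.0, 4), remainder_integral(1.0, 4))
```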


Exercises 


4.1 Let a curve be given in parametric representation $x(t)$, $y(t)$. Show that its arc
    length for $a \le t \le b$ is

        $L = \int_a^b \sqrt{x'(t)^2 + y'(t)^2}\,dt.$

    Compute the arc length of the cycloid (3.18) for $0 \le t \le 2\pi$.
4.2 Compute the integrals 


x dx dx 
a b ———., c x’ sin x da, 
= 7 d ii 
d. 
d) I_» ee : . 4» NE) fee dx, f) [ exccose de. 
sin” x 


a dx 
ax , h ar as 
g) he cos Bxdz, h) Je sin@xdz, i) [=a 


Hints. For (d) reverse Eq. (4.29), for (e) write $x^3 = x \cdot x^2$, for (g) and (h)
do either integration by parts or decompose $\int e^{(\alpha + i\beta)x}\,dx$ into its real and
imaginary parts.


4.3 Show by repeated integration by parts that for integer values m and n

(4.34)   $\int_a^b \frac{(x - a)^m}{m!}\,\frac{(b - x)^n}{n!}\,dx = \frac{(b - a)^{m+n+1}}{(m + n + 1)!},$

in particular
II.5 Functions with Elementary Integral


The above quantity
         $\frac{pp\,a\,ds}{qq\,ss - pp\,aa}$
reduces immediately, without any change, to two logarithmical fractions,
by separating it thus:
         $\frac{pp\,a\,ds}{qq\,ss - pp\,aa} = \frac{\frac{1}{2}\,p\,ds}{qs - pa} - \frac{\frac{1}{2}\,p\,ds}{qs + pa}$
(Annex to a letter of Joh. Bernoulli 1699, see Briefwechsel, vol. 1, p. 212)
Problem 3: If X denotes an arbitrary rational function of x, describe a
method by which the expression X dx can be integrated.
(Euler 1768, Opera Omnia, vol. XI, p. 28)
In the preceding section, we learned some techniques of integration. Here, we will 
use these techniques systematically in order to establish the fact that the integrals 
of several classes of functions are elementary. Elementary functions are functions 
composed of polynomials, rational, exponential, logarithmic, trigonometric, and 
inverse trigonometric functions. 


Integration of Rational Functions 


Let $R(x) = P(x)/Q(x)$ be a rational function ($P(x)$ and $Q(x)$ polynomials). We
shall present a constructive proof of the fact that $\int R(x)\,dx$ is elementary. The
computation of a primitive will be carried out in three steps:

— reduction to the case deg P < deg Q (deg P denotes the degree of $P(x)$);

— factorization of $Q(x)$ and decomposition of $R(x)$ into partial fractions; and

— integration of the partial fractions.


Reduction to the Case deg P < deg Q. A first simplification of the function
$R(x)$ can be achieved if deg P ≥ deg Q. In this situation, we divide P by Q and
obtain

(5.1)    $\frac{P(x)}{Q(x)} = S(x) + \frac{\widehat{P}(x)}{Q(x)},$

where $S(x)$ and $\widehat{P}(x)$ are polynomials (quotient and remainder) with
deg $\widehat{P}$ < deg Q. As an example, consider

(5.2)    $\frac{P(x)}{Q(x)} = \frac{2x^6 - 3x^5 - 9x^4 + 23x^3 + x^2 - 44x + 39}{x^5 + x^4 - 5x^3 - x^2 + 8x - 4}.$

We first remove the term $2x^6$ by subtracting $2x\,Q(x)$ from $P(x)$, then we add
$5\,Q(x)$ to $P(x)$ and arrive at

(5.3)    $\frac{P(x)}{Q(x)} = 2x - 5 + \frac{6x^4 - 20x^2 + 4x + 19}{x^5 + x^4 - 5x^3 - x^2 + 8x - 4}.$

The polynomial $S(x)$ is readily integrated so that only the second term in (5.1)
requires further investigation.
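
The division step can be carried out mechanically. The following is a minimal sketch of polynomial long division with exact rational arithmetic; the coefficient lists (highest degree first) and the function name are assumptions of the sketch. Applied to the example (5.2) it returns the quotient 2x - 5 and the remainder of (5.3).

```python
from fractions import Fraction


def poly_divmod(p, q):
    """Divide polynomial p by q; coefficients are listed from the highest degree down.

    Returns (quotient, remainder) as lists of Fractions; the remainder has
    fewer coefficients than q.
    """
    p = [Fraction(c) for c in p]
    q = [Fraction(c) for c in q]
    quotient = []
    while len(p) >= len(q):
        factor = p[0] / q[0]          # removes the leading term of p
        quotient.append(factor)
        for i, c in enumerate(q):
            p[i] -= factor * c
        p.pop(0)                      # leading coefficient is now zero
    return quotient, p


if __name__ == "__main__":
    P = [2, -3, -9, 23, 1, -44, 39]
    Q = [1, 1, -5, -1, 8, -4]
    print(poly_divmod(P, Q))
```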
Decomposition into Partial Fractions. We assume that a factorization of $Q(x)$
into linear terms is known:

(5.4)    $Q(x) = (x - a_1)^{m_1}\,(x - a_2)^{m_2} \cdot \ldots \cdot (x - a_k)^{m_k} = \prod_{i=1}^{k} (x - a_i)^{m_i}.$

Here, $a_1, \ldots, a_k$ are the (possibly complex) distinct roots of $Q(x)$ and the $m_i$ are
their corresponding multiplicities. The following lemma shows how our rational
function can be written as a linear combination of simple fractions, so-called partial
fractions. This idea goes back to the correspondence between Joh. Bernoulli
and Leibniz (around 1700), and was systematically exploited by Joh. Bernoulli
(1702), Leibniz (1702), Euler (1768, Caput I, Problema 3), and Hermite (1873).


(5.1) Lemma. Let $Q(x)$ be given by (5.4) and let $P(x)$ be a polynomial satisfying
deg P < deg Q. Then there exist constants $C_{ij}$ such that

(5.5)    $\frac{P(x)}{Q(x)} = \sum_{i=1}^{k} \sum_{j=1}^{m_i} \frac{C_{ij}}{(x - a_i)^j}.$


Proof. We eliminate one factor of $Q(x)$ after another as follows: we write
$Q(x) = (x - a)^m\,q(x)$, where a is a root of $Q(x)$ and $q(a) \ne 0$. We will show the existence
of a constant C and of a polynomial $p(x)$ of degree < deg Q - 1 such that

(5.6)    $\frac{P(x)}{(x - a)^m\,q(x)} = \frac{C}{(x - a)^m} + \frac{p(x)}{(x - a)^{m-1}\,q(x)}$

or equivalently (multiply by the common denominator),

(5.7)    $P(x) = C \cdot q(x) + p(x) \cdot (x - a).$

By putting x = a, this formula motivates the choice

(5.8)    $C = P(a)/q(a).$

The polynomial $p(x)$ is obtained from a division of $P(x) - C \cdot q(x)$ by the factor
$(x - a)$. The same procedure is then recursively applied to the right expression of
(5.6) and we obtain the desired decomposition (5.5).


Example. The polynomial Q(x) of (5.2) has the factorization

(5.9)   $$Q(x) = x^5 + x^4 - 5x^3 - x^2 + 8x - 4 = (x-1)^3(x+2)^2.$$

Applying (5.7) and (5.8) with $\alpha = -2$ and m = 2, we obtain, for (5.6),

$$\frac{6x^4 - 20x^2 + 4x + 19}{(x-1)^3(x+2)^2} = \frac{-1}{(x+2)^2} + \frac{6x^3 - 11x^2 - x + 9}{(x-1)^3(x+2)}.$$

A second application with $\alpha = -2$ and m = 1 gives

(5.10)   $$\frac{6x^4 - 20x^2 + 4x + 19}{(x-1)^3(x+2)^2} = \frac{-1}{(x+2)^2} + \frac{3}{x+2} + \frac{3x^2 - 8x + 6}{(x-1)^3}.$$

In the last expression, we replace x = (x − 1) + 1 so that $3x^2 - 8x + 6 =
3(x-1)^2 - 2(x-1) + 1$, and (5.10) becomes, finally,

(5.11)   $$\frac{6x^4 - 20x^2 + 4x + 19}{(x-1)^3(x+2)^2} = \frac{1}{(x-1)^3} + \frac{-2}{(x-1)^2} + \frac{3}{x-1} + \frac{-1}{(x+2)^2} + \frac{3}{x+2}.$$
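
The recursive elimination of the proof can be turned into a short program. The following is only a sketch under our own assumptions: the function names, the coefficient ordering (highest power first), and the use of exact fractions are not from the text, and the roots of Q are assumed to be known rational numbers with deg P < deg Q.

```python
from fractions import Fraction

def poly_eval(p, x):
    """Horner evaluation; coefficients are listed from the highest power down."""
    value = Fraction(0)
    for c in p:
        value = value * x + c
    return value

def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def divide_by_linear(p, a):
    """Synthetic division of p by (x - a), assuming p(a) = 0 (remainder dropped)."""
    out, carry = [], Fraction(0)
    for c in p[:-1]:
        carry = carry * a + c
        out.append(carry)
    return out

def partial_fractions(P, roots):
    """Return {(root, j): C} such that P/Q is the sum of C / (x - root)^j.

    roots is a list of (root, multiplicity) pairs describing Q as in (5.4).
    """
    P = [Fraction(c) for c in P]
    remaining = [(Fraction(a), m) for a, m in roots]
    coefficients = {}
    while remaining:
        a, m = remaining.pop()
        q = [Fraction(1)]                 # q(x): factors belonging to the other roots
        for b, k in remaining:
            for _ in range(k):
                q = poly_mul(q, [Fraction(1), -b])
        for j in range(m, 0, -1):
            C = poly_eval(P, a) / poly_eval(q, a)       # choice (5.8)
            coefficients[(a, j)] = C
            width = max(len(P), len(q))
            P_pad = [Fraction(0)] * (width - len(P)) + P
            q_pad = [Fraction(0)] * (width - len(q)) + q
            diff = [u - C * v for u, v in zip(P_pad, q_pad)]
            P = divide_by_linear(diff, a)               # p(x) from (5.7)
    return coefficients

result = partial_fractions([6, 0, -20, 4, 19], [(1, 3), (-2, 2)])
for (root, j), C in sorted(result.items()):
    print(f"C = {C} for 1/(x - ({root}))^{j}")
```

Applied to the example, the sketch treats the root −2 first (as in the text) and returns the five coefficients of the decomposition (5.11).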


Second Possibility. By Lemma 5.1, we know that

(5.12)   $$\frac{6x^4 - 20x^2 + 4x + 19}{(x-1)^3(x+2)^2} = \frac{A_0}{(x-1)^3} + \frac{A_1}{(x-1)^2} + \frac{A_2}{x-1} + \frac{B_0}{(x+2)^2} + \frac{B_1}{x+2}.$$

The coefficients $A_i$ and $B_i$ can be computed as follows: we multiply Eq. (5.12) by
$(x-1)^3$ so that

$$\frac{6x^4 - 20x^2 + 4x + 19}{(x+2)^2} = A_0 + A_1(x-1) + A_2(x-1)^2 + (x-1)^3 g(x),$$

with some function g(x) well defined in a neighborhood of x = 1. Hence, the $A_i$
are the first coefficients of the Taylor series of $P(x)/(x+2)^2$ (see Sect. II.2),
where P(x) denotes the numerator of (5.12), and satisfy

$$A_i = \frac{1}{i!}\,\frac{d^i}{dx^i}\Big(\frac{P(x)}{(x+2)^2}\Big)\Big|_{x=1},$$

i.e., $A_0 = 1$, $A_1 = -2$, $A_2 = 3$. In a similar way, we get

$$B_i = \frac{1}{i!}\,\frac{d^i}{dx^i}\Big(\frac{P(x)}{(x-1)^3}\Big)\Big|_{x=-2},$$

i.e., $B_0 = -1$, $B_1 = 3$.


Integration of Partial Fractions. The individual terms in the decomposition (5.5)
can easily be integrated by using the formulas of Sect. II.4 (see Table 4.1):

(5.13)   $$\int\frac{dx}{(x-\alpha)^j} = \begin{cases} \dfrac{-1}{(j-1)(x-\alpha)^{j-1}} & \text{if } j > 1, \\[2mm] \ln(x-\alpha) & \text{if } j = 1. \end{cases}$$

Combining Eqs. (5.3), (5.9), and (5.11), we thus obtain, for our example,

$$\int\frac{P(x)}{Q(x)}\,dx = x^2 - 5x - \frac{1}{2(x-1)^2} + \frac{2}{x-1} + 3\ln(x-1) + \frac{1}{x+2} + 3\ln(x+2) + C.$$

If all roots of Q(x) are real (i.e., the $\alpha_i$ of (5.4) are real) then the $C_{ij}$ in
(5.5) are real and we have expressed the integral as a linear combination of real
functions. But nothing prevents us from applying the above reduction process also
in the case where Q(x) has complex roots.
Example with Complex Roots. Suppose we want to compute $\int (1+x^4)^{-1}dx$.
Since the roots of $x^4 + 1 = 0$ are $\alpha_1 = (1+i)/\sqrt2$, $\alpha_2 = (1-i)/\sqrt2$,
$\alpha_3 = (-1+i)/\sqrt2$, $\alpha_4 = (-1-i)/\sqrt2$, the decomposition of Lemma 5.1 leads to

(5.14)   $$\frac{1}{1+x^4} = \frac{A}{x-(1+i)/\sqrt2} + \frac{B}{x-(1-i)/\sqrt2} + \frac{C}{x+(1-i)/\sqrt2} + \frac{D}{x+(1+i)/\sqrt2}.$$

By (5.8), we get

$$A = \frac{1}{(\alpha_1-\alpha_2)(\alpha_1-\alpha_3)(\alpha_1-\alpha_4)} = \frac{1}{i\sqrt2\cdot\sqrt2\cdot(\sqrt2+i\sqrt2)} = -\frac{(1+i)\sqrt2}{8}$$

and similarly $B = (-1+i)\sqrt2/8$, $C = (1-i)\sqrt2/8$, $D = (1+i)\sqrt2/8$. Hence,


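(5.15)   $$\int\frac{dx}{1+x^4} = A\ln\big(x-(1+i)/\sqrt2\big) + B\ln\big(x-(1-i)/\sqrt2\big) + C\ln\big(x+(1-i)/\sqrt2\big) + D\ln\big(x+(1+i)/\sqrt2\big).$$

Using (I.5.11) and the relation

$$\arctan u + \arctan(1/u) = \begin{cases} \pi/2 & \text{if } u > 0 \\ -\pi/2 & \text{if } u < 0, \end{cases}$$

which follows from (I.4.5) or from (I.4.32), we have

$$\ln(x + \alpha + i\beta) = \frac12\ln\big((x+\alpha)^2 + \beta^2\big) + i\arctan\frac{\beta}{x+\alpha},$$

and the right-hand side of expression (5.15) can be written as

(5.16)   $$\int\frac{dx}{x^4+1} = \frac{\sqrt2}{8}\,\ln\frac{x^2+\sqrt2\,x+1}{x^2-\sqrt2\,x+1} + \frac{\sqrt2}{4}\Big(\arctan(x\sqrt2+1) + \arctan(x\sqrt2-1)\Big).$$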
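
A formula such as (5.16) is easy to test numerically. The sketch below is our own illustration and not part of the text: it compares the difference F(√2) − F(0) of the right-hand side of (5.16) with a simple midpoint sum for the integral over [0, √2]; the helper names `F` and `midpoint` and the choice of 4000 subintervals are assumptions made for the illustration.

```python
import math

def F(x):
    """Right-hand side of (5.16)."""
    r2 = math.sqrt(2.0)
    log_part = math.log((x * x + r2 * x + 1.0) / (x * x - r2 * x + 1.0))
    atan_part = math.atan(r2 * x + 1.0) + math.atan(r2 * x - 1.0)
    return r2 / 8.0 * log_part + r2 / 4.0 * atan_part

def midpoint(f, a, b, n):
    """Midpoint sum with n subintervals of equal length."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

upper = math.sqrt(2.0)
exact = F(upper) - F(0.0)
approx = midpoint(lambda x: 1.0 / (1.0 + x ** 4), 0.0, upper, 4000)
print(exact, approx)   # the two numbers should agree to several digits
```

Both values are positive, as they must be for the integral of a positive function.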


Avoiding Complex Arithmetic. Whenever complex arithmetic is not desired, we
can proceed as follows: suppose that the polynomial Q(x) has l distinct complex
conjugate pairs of roots $\alpha_1 \pm i\beta_1, \ldots, \alpha_l \pm i\beta_l$ and k distinct real roots
$\gamma_1, \ldots, \gamma_k$. Then, we have the real factorization

(5.17)   $$Q(x) = \prod_{i=1}^{l}\big((x-\alpha_i)^2 + \beta_i^2\big)^{m_i}\,\prod_{i=1}^{k}(x-\gamma_i)^{n_i},$$

where $m_i$ and $n_i$ denote the multiplicities of the roots. A real version of Lemma 5.1
is then as follows:




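(5.2) Lemma. Let Q(x) be given by (5.17) and let P(x) be a polynomial with real
coefficients satisfying deg P < deg Q. Then, there exist real constants $A_{ij}$, $B_{ij}$,
and $C_{ij}$ such that

(5.18)   $$\frac{P(x)}{Q(x)} = \sum_{i=1}^{l}\sum_{j=1}^{m_i}\frac{A_{ij} + B_{ij}x}{\big((x-\alpha_i)^2 + \beta_i^2\big)^j} + \sum_{i=1}^{k}\sum_{j=1}^{n_i}\frac{C_{ij}}{(x-\gamma_i)^j}.$$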


Proof. The real roots can be treated as in the proof of Lemma 5.1. For the treatment
of the complex roots we write $Q(x) = \big((x-\alpha)^2 + \beta^2\big)^m q(x)$, where $\alpha + i\beta$ is
a root of Q(x) and $q(\alpha + i\beta) \ne 0$. Then, there exist real constants A, B and a
polynomial p(x) of degree < deg Q − 2 such that

$$\frac{P(x)}{\big((x-\alpha)^2+\beta^2\big)^m q(x)} = \frac{A + Bx}{\big((x-\alpha)^2+\beta^2\big)^m} + \frac{p(x)}{\big((x-\alpha)^2+\beta^2\big)^{m-1} q(x)}.$$

To see this, we consider the equivalent equation

$$P(x) = (A + Bx)\cdot q(x) + p(x)\cdot\big((x-\alpha)^2 + \beta^2\big).$$

By putting $x = \alpha + i\beta$, this formula yields A and B, and the polynomial p(x) is
obtained from a division of $P(x) - (A + Bx)\cdot q(x)$ by the factor $\big((x-\alpha)^2 + \beta^2\big)$.
As in the proof of Lemma 5.1, the formula (5.18) is then obtained by induction on
the degree of Q(x).


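For the integration of the general term of (5.18) we write it as

$$\frac{A + Bx}{\big((x-\alpha)^2+\beta^2\big)^j} = \frac{B(x-\alpha)}{\big((x-\alpha)^2+\beta^2\big)^j} + \frac{A + B\alpha}{\big((x-\alpha)^2+\beta^2\big)^j}.$$

The first term of this sum can immediately be integrated with the help of the
substitution $z = (x-\alpha)^2 + \beta^2$, $dz = 2(x-\alpha)\,dx$. For the second term we use
the substitution $z = (x-\alpha)/\beta$ and obtain the integral (4.30) of Sect. II.4. Hence,
for j = 1 we have

$$\int\frac{A + Bx}{(x-\alpha)^2+\beta^2}\,dx = \frac{B}{2}\ln\big((x-\alpha)^2+\beta^2\big) + \frac{A + B\alpha}{\beta}\arctan\Big(\frac{x-\alpha}{\beta}\Big),$$

and for j > 1

$$\int\frac{A + Bx}{\big((x-\alpha)^2+\beta^2\big)^j}\,dx = \frac{-B}{2(j-1)\big((x-\alpha)^2+\beta^2\big)^{j-1}} + \frac{A + B\alpha}{\beta^{2j-1}}\,J_j\Big(\frac{x-\alpha}{\beta}\Big),$$

where $J_1(z) = \arctan z$ and

(5.19)   $$J_{j+1}(z) = \frac{z}{2j\,(z^2+1)^j} + \frac{2j-1}{2j}\,J_j(z).$$

Example. For the function of Eq. (5.14), Lemma 5.2 gives the decomposition

$$\frac{1}{1+x^4} = \frac{1}{(x^2+\sqrt2\,x+1)(x^2-\sqrt2\,x+1)} = \frac{A + Bx}{x^2+\sqrt2\,x+1} + \frac{C + Dx}{x^2-\sqrt2\,x+1}.$$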
Multiplication of this relation by $(x^2+\sqrt2\,x+1)$ and insertion of $x = (-1+i)/\sqrt2$
yields

$$\frac{1}{2-2i} = \frac{1+i}{4} = A + B\,\frac{-1+i}{\sqrt2},$$

and $A = 1/2$, $B = \sqrt2/4$ is obtained by comparing real and imaginary parts of
this relation. The constants $C = 1/2$ and $D = -\sqrt2/4$ are obtained analogously.
Using the above formulas we get (5.16) again.


Remark. Decomposition into partial fractions renewed the interest of the
mathematicians of the 18th century in the roots of polynomials and in algebra.


Useful Substitutions 


We now exploit the above result and present several substitutions that lead to
further classes of functions whose indefinite integrals are elementary functions. In
the rest of this section, R denotes a rational function with one, two, or three
arguments.


Integrals of the Form $\int R(\sqrt{ax+b},\,x)\,dx$. An obvious substitution is

(5.20)   $$\sqrt{ax+b} = u, \qquad x = \frac{u^2-b}{a}, \qquad dx = \frac{2u}{a}\,du,$$

with which we get

$$\int R(\sqrt{ax+b},\,x)\,dx = \frac{2}{a}\int R\Big(u, \frac{u^2-b}{a}\Big)\,u\,du = \int\widetilde R(u)\,du,$$

where $\widetilde R(u)$ is a rational function. This last integral can be computed with the
techniques explained above.


Integrals of the Form $\int R(e^{\lambda x})\,dx$. The obvious substitution $u = e^{\lambda x}$ gives
$du = \lambda e^{\lambda x}\,dx$ and $dx = du/(\lambda u)$, and the resulting integral is that of a rational
function.


Example.

$$\int\frac{dx}{2+\sinh x} = \int\frac{dx}{2 + (e^x - e^{-x})/2} = \int\frac{2\,du}{u^2+4u-1} = 2\int\frac{du}{(u+2)^2-5}$$
$$= \frac{1}{\sqrt5}\,\ln\frac{u+2-\sqrt5}{u+2+\sqrt5} = \frac{1}{\sqrt5}\,\ln\frac{e^x+2-\sqrt5}{e^x+2+\sqrt5}.$$

Here we have used the formula of Exercise 5.1 below.


Integrals of the Form $\int R(\sin x, \cos x, \tan x)\,dx$. We know from antiquity
(Pythagoras 570-501 B.C., see also R.C. Buck 1980, Sherlock Holmes in Babylon,
Am. Math. Monthly vol. 87, Nr. 5, p. 335-345) that the triples (3, 4, 5), (5, 12, 13),
(7, 24, 25), ..., satisfy $a^2 + b^2 = c^2$ and are of the form $(u, (u^2-1)/2, (u^2+1)/2)$.
This suggests the substitution (Euler 1768, Caput V, §261)
(5.21)   $$\sin x = \frac{2u}{1+u^2}, \qquad \cos x = \frac{1-u^2}{1+u^2}, \qquad \tan x = \frac{2u}{1-u^2}.$$

One verifies that $\sin x = u(1 + \cos x)$, so that the point $(\cos x, \sin x)$ lies at
the intersection of the line $\eta = u(1 + \xi)$ with the unit circle (see the figure).
Consequently, we have $u = \tan(x/2)$, $x = 2\arctan u$, and

$$dx = \frac{2}{1+u^2}\,du.$$

All this inserted into $\int R(\sin x, \cos x, \tan x)\,dx$ provides an integral of a
rational function.

Example.

$$\int\frac{dx}{2+\sin x} = \int\frac{2\,du}{(1+u^2)\big(2 + \frac{2u}{1+u^2}\big)} = \int\frac{du}{u^2+u+1}.$$

The last integral is known from Eq. (4.18), thus,

$$\int\frac{dx}{2+\sin x} = \frac{2}{\sqrt3}\arctan\Big(\frac{2}{\sqrt3}\Big(u+\frac12\Big)\Big) = \frac{2}{\sqrt3}\arctan\Big(\frac{2}{\sqrt3}\Big(\tan\frac{x}{2}+\frac12\Big)\Big).$$


Integrals of the Form $\int R(\sqrt{ax^2+2bx+c},\,x)\,dx$. The idea (Euler 1768, § 88)
is to define a new variable z by the relation $ax^2 + 2bx + c = a(x-z)^2$. This
yields the substitution

(5.22)   $$x = \frac{az^2-c}{2(b+az)}, \qquad dx = \frac{a(az^2+2bz+c)}{2(b+az)^2}\,dz,$$
$$\sqrt{ax^2+2bx+c} = \sqrt a\,(z-x) = \sqrt a\;\frac{az^2+2bz+c}{2(b+az)}, \qquad z = x + \sqrt{ax^2+2bx+c}\,/\sqrt a,$$

and we again get an integral of a rational function. For a < 0 this leads to complex
arithmetic, which can be avoided by the transformation of Exercise 5.3.

Sometimes it is more convenient to transform the expression $\sqrt{ax^2+2bx+c}$
by a suitable linear substitution $z = \alpha x + \beta$ into one of the forms

$$\sqrt{z^2+1}, \qquad \sqrt{z^2-1}, \qquad \sqrt{1-z^2}.$$

Then, the substitutions

(5.23)   $$z = \sinh u, \qquad z = \cosh u, \qquad z = \sin u$$

can be applied to eliminate the square root in the integral.
Example. Consider again the integral (4.27). Putting $x = \sinh u$, we get

$$\int\sqrt{x^2+1}\,dx = \int\cosh^2 u\,du = \int\Big(\frac12 + \frac{\cosh 2u}{2}\Big)du = \frac{u}{2} + \frac{\sinh 2u}{4}$$
$$= \frac{u}{2} + \frac{\sinh u\,\cosh u}{2} = \frac12\ln\big(x+\sqrt{x^2+1}\big) + \frac{x}{2}\sqrt{x^2+1}.$$

For the inverse function of $x = \sinh u$ see Exercise I.4.3.


Exercises 


5.1 (Joh. Bernoulli, see quotation at the beginning of this section). Prove that

$$\int\frac{dx}{x^2-a^2} = \frac{1}{2a}\,\ln\frac{x-a}{x+a}.$$

5.2 Show that $\displaystyle\int R\Big(\sqrt[n]{\frac{ax+b}{ex+f}},\,x\Big)\,dx$ is an elementary function.


5.3 (Euler 1768, Caput II, §88). Suppose that $ax^2 + 2bx + c$ has distinct real
roots $\alpha$, $\beta$. Show that the substitution $z^2 = a(x-\beta)/(x-\alpha)$ transforms the
integral

$$\int R\big(\sqrt{ax^2+2bx+c},\,x\big)\,dx$$

(R is a rational function of two arguments) into $\int\widetilde R(z)\,dz$, where $\widetilde R$ is
rational.


5.4 Mr. C.L. Ever simplifies Eq. (5.16) with the help of (I.4.32) to

$$\int\frac{dx}{x^4+1} = \frac{\sqrt2}{8}\,\ln\frac{x^2+\sqrt2\,x+1}{x^2-\sqrt2\,x+1} + \frac{\sqrt2}{4}\arctan\frac{x\sqrt2}{1-x^2},$$

and obtains, e.g.,

$$\int_0^{\sqrt2}\frac{dx}{x^4+1} = \frac{\sqrt2}{8}\ln 5 + \frac{\sqrt2}{4}\arctan(-2) = -0.1069250677,$$

a negative value for the integral of a positive function. Where did he make a
mistake and what is the correct value?


5.5 Compute

$$\int\frac{dx}{\sqrt{x^2+1}}$$

twice; once with the substitution (5.22) and once with the substitution (5.23).
This leads to the formula $\operatorname{arsinh} x = \ln(x + \sqrt{x^2+1})$ (see Exercise I.4.3).


5.6 Prove that

$$\int R(\sin^2 x, \cos^2 x, \tan x)\,dx$$

can be integrated with the substitution

$$\sin^2 x = \frac{u^2}{1+u^2}, \qquad \cos^2 x = \frac{1}{1+u^2}, \qquad \tan x = u.$$
II.6 Approximate Computation of Integrals


... because after all these attempts, analysts have finally concluded that one
must abandon all hope of expressing elliptical arcs with the use of algebraic
formulas, logarithms and circular arcs.

(Lambert 1772, Rectification elliptischer Bögen ..., Opera vol. I, p. 312)


Although the problem of numerical quadrature is about two hundred years
old and has been considered by many geometers: Newton, Cotes, Gauss,
Jacobi, Hermite, Tchébychef, Christoffel, Heine, Radeau [sic], A. Markov,
T. Stitjes [sic], C. Possé, C. Andréev, N. Sonin and others, it can nevertheless
not be considered sufficiently exhausted. (Steklov 1918)


One easily convinces oneself by our method that the integral $\int\frac{e^x\,dx}{x}$,
which has greatly occupied geometers, is impossible in finite form ...
(Liouville 1835, p. 113)

In spite of the extraordinary results of the previous sections, many integrals
resisted the ingenuity of the Bernoullis, of Euler, of Lagrange, and of many others.
Amongst these integrals, we note

$$\int e^{x^2}\,dx, \qquad \int\frac{e^x}{x}\,dx, \qquad \int\frac{dx}{\ln x},$$
$$\int\frac{dx}{\sqrt{4x^3 - g_2x - g_3}}, \qquad \int\sqrt{1-k^2\cos^2 x}\,dx, \qquad \int\frac{dx}{\sqrt{(1-x^2)(1-k^2x^2)}}.$$

The last three are so-called “elliptic integrals”. Legendre, Abel, Jacobi, and
Weierstrass devoted a great deal of their work to the study of these integrals. The
above integrals cannot be expressed in finite terms of elementary functions (Liouville
1835, see quotation), and we are confronted with new functions that have to be
computed with new methods.

We consider three approaches: (1) series expansions; (2) approximation by 
polynomials (numerical integration); and (3) asymptotic expansions. 


Series Expansions 


The idea is to develop the function into a series (either in terms of powers of x, or 
in terms of other expressions) and to integrate term by term. A justification of this 
procedure will be given in Sect. III.5 below. 


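Historical Examples. The computations of Mercator (see Eq. (I.3.13))

$$\ln(1+x) = \int_0^x\frac{dt}{1+t} = \int_0^x(1 - t + t^2 - \ldots)\,dt = x - \frac{x^2}{2} + \frac{x^3}{3} - \ldots$$

are the oldest example. The computation of the length of an arc of the circle
$y = \sqrt{1-x^2}$ (see Eq. (4.9) and Theorem I.2.2)

$$\arcsin x = \int_0^x\sqrt{1+y'^2}\,dt = \int_0^x\sqrt{1+\frac{t^2}{1-t^2}}\,dt = \int_0^x(1-t^2)^{-1/2}\,dt$$
$$= \int_0^x\Big(1 + \frac12 t^2 + \frac{1\cdot3}{2\cdot4}t^4 + \ldots\Big)dt = x + \frac12\,\frac{x^3}{3} + \frac{1\cdot3}{2\cdot4}\,\frac{x^5}{5} + \ldots$$

is precisely Newton’s approach to Eq. (I.4.25).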
Perimeter of the Ellipse. We wish to compute the perimeter of the ellipse with
semiaxes 1 and b:

$$x^2 + \frac{y^2}{b^2} = 1 \qquad\text{or}\qquad x = \cos t, \quad y = b\sin t.$$

Since $dx = -\sin t\,dt$ and $dy = b\cos t\,dt$, the perimeter is

(6.1)   $$P = \int_0^{2\pi}\sqrt{dx^2+dy^2} = \int_0^{2\pi}\sqrt{\sin^2 t + b^2\cos^2 t}\,dt = 4\int_0^{\pi/2}\sqrt{1-\alpha\cos^2 t}\,dt, \qquad \alpha = 1-b^2.$$

This is an “elliptic integral” (whence the name), which is not elementary. We
compute it as follows: suppose that $1 \ge b \ge 0$, thus $0 \le \alpha \le 1$. The idea is to use
Newton’s series for $\sqrt{1-x}$ (Theorem I.2.2),

(6.2)   $$\sqrt{1-x} = 1 - \frac12 x - \frac{1}{2\cdot4}x^2 - \frac{1\cdot3}{2\cdot4\cdot6}x^3 - \frac{1\cdot3\cdot5}{2\cdot4\cdot6\cdot8}x^4 - \ldots,$$

which gives

(6.3)   $$P = 4\int_0^{\pi/2}\Big(1 - \frac12\alpha\cos^2 t - \frac{1\cdot1}{2\cdot4}\alpha^2\cos^4 t - \ldots\Big)dt.$$

With the techniques of Sect. II.4 (see Eq. (4.28)), we find that

$$\int_0^{\pi/2}\cos^{2n}t\,dt = \frac{\pi}{2}\cdot\frac{1\cdot3\cdot5\cdots(2n-1)}{2\cdot4\cdot6\cdots(2n)},$$

and (6.3) becomes (cf. Euler 1750, Opera, vol. XX, p. 49)

(6.4)   $$P = 2\pi\Big(1 - \frac12\cdot\frac12\,\alpha - \frac{1\cdot1}{2\cdot4}\cdot\frac{1\cdot3}{2\cdot4}\,\alpha^2 - \frac{1\cdot1\cdot3}{2\cdot4\cdot6}\cdot\frac{1\cdot3\cdot5}{2\cdot4\cdot6}\,\alpha^3 - \ldots\Big).$$

The convergence of this formula is illustrated in Fig. 6.1. For $\alpha = 0$ (i.e., b = 1)
we have a circle, and $P = 2\pi$. For $\alpha = 1$ (i.e., b = 0) the series converges very
slowly to the correct value, 4.
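
The partial sums of the series for P are easily computed. The following sketch is our own illustration (the function name and the way the two coefficient factors are updated recursively are assumptions, not the book's procedure): the n-th term is the product of the coefficient of xⁿ in the series of √(1 − x), the factor 1·3⋯(2n−1)/(2·4⋯2n) from the integral of cos²ⁿ t, and αⁿ.

```python
import math

def ellipse_perimeter_series(b, terms):
    """Partial sum of the series for the perimeter of the ellipse with semiaxes 1 and b."""
    alpha = 1.0 - b * b
    total = 1.0
    c = 0.5        # size of the coefficient of x^n in sqrt(1 - x), here for n = 1
    w = 0.5        # 1*3*...*(2n-1) / (2*4*...*(2n)), here for n = 1
    power = alpha  # alpha^n
    for n in range(1, terms + 1):
        total -= c * w * power
        # pass from n to n + 1
        c *= (2 * n - 1) / (2 * n + 2)
        w *= (2 * n + 1) / (2 * n + 2)
        power *= alpha
    return 2.0 * math.pi * total

for terms in (1, 5, 20, 100, 400):
    print(terms, ellipse_perimeter_series(0.2, terms))
```

For b = 0.2 (α = 0.96) the terms decrease only like 0.96ⁿ, so several hundred terms are needed before the partial sums settle.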


Fresnel’s Integrals. The Fresnel Integrals (Fresnel 1818),

(6.5)   $$x(t) = \int_0^t\cos(u^2)\,du, \qquad y(t) = \int_0^t\sin(u^2)\,du,$$

have interesting properties (Exercise 6.4) and produce, in the (x, y) plane, a
beautiful spiral (Fig. 6.2). They are not elementary. However, the functions $\sin(u^2)$ and
$\cos(u^2)$ have a simple infinite series (the series of sin z and cos z where $z = u^2$;
see (I.4.16) and (I.4.17)), of which we evaluate the integral term by term, as
follows:
$$x(t) = \int_0^t\Big(1 - \frac{u^4}{2!} + \frac{u^8}{4!} - \ldots\Big)du = t - \frac{t^5}{5\cdot2!} + \frac{t^9}{9\cdot4!} - \frac{t^{13}}{13\cdot6!} + \ldots,$$
$$y(t) = \int_0^t\Big(u^2 - \frac{u^6}{3!} + \frac{u^{10}}{5!} - \ldots\Big)du = \frac{t^3}{3} - \frac{t^7}{7\cdot3!} + \frac{t^{11}}{11\cdot5!} - \frac{t^{15}}{15\cdot7!} + \ldots.$$

FIGURE 6.1. Convergence of the series (6.4) (perimeter of the ellipse)

FIGURE 6.2. Fresnel’s Integrals

The convergence of these series is illustrated in Fig. 6.3. The results are excellent
for small values of t. For increasing values of |t|, more and more terms need to be
taken into account.

FIGURE 6.3. Fresnel’s Integrals by power series; the numbers 5, 9, 13 and 7, 11, 15 indicate
the last power of t taken into account
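
The termwise integration is straightforward to program. The sketch below is our own illustration (the function name `fresnel_series` and its arguments are assumptions): it sums the first terms of the two power series obtained by integrating the series of cos(u²) and sin(u²).

```python
import math

def fresnel_series(t, terms):
    """Partial sums of the power series of x(t) and y(t) in (6.5)."""
    x = 0.0
    y = 0.0
    for k in range(terms):
        sign = -1.0 if k % 2 else 1.0
        # term of cos(u^2): u^(4k) / (2k)!, integrated from 0 to t
        x += sign * t ** (4 * k + 1) / ((4 * k + 1) * math.factorial(2 * k))
        # term of sin(u^2): u^(4k+2) / (2k+1)!, integrated from 0 to t
        y += sign * t ** (4 * k + 3) / ((4 * k + 3) * math.factorial(2 * k + 1))
    return x, y

for terms in (2, 3, 4, 12):
    print(terms, fresnel_series(1.0, terms))
```

For t = 1 a handful of terms already gives many correct digits; for larger |t| the number of terms has to grow, in agreement with the remark above.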


Numerical Methods 


Suppose we want to compute the integral $\int_a^b f(x)\,dx$, where the integration
interval is given. The idea is the following: we fix N, subdivide the interval [a, b] into
N subintervals of length h = (b − a)/N,

$$x_0 = a, \quad x_1 = a + h, \quad \ldots, \quad x_i = a + ih, \quad \ldots, \quad x_N = b,$$

and replace the function f(x) locally by polynomials that can easily be integrated.
Trapezoidal Rule. On the interval $[x_i, x_{i+1}]$, the function f(x) is replaced by
a straight line passing through $(x_i, f(x_i))$ and $(x_{i+1}, f(x_{i+1}))$. The integral
between $x_i$ and $x_{i+1}$ is then approximated by the trapezoidal area
$h\cdot\big(f(x_i) + f(x_{i+1})\big)/2$ and we obtain

(6.6)   $$\int_a^b f(x)\,dx \approx \sum_{i=0}^{N-1}\frac{h}{2}\big(f(x_i) + f(x_{i+1})\big) = h\Big(\frac{f(x_0)}{2} + f(x_1) + f(x_2) + \ldots + f(x_{N-1}) + \frac{f(x_N)}{2}\Big).$$
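
A direct transcription of the trapezoidal rule might look as follows. This is a minimal sketch; the function name `trapezoid` and its signature are our own choice, and the integrand cos(u²) on [0, 1] serves only as an illustration.

```python
import math

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n subintervals of equal length."""
    h = (b - a) / n
    inner = sum(f(a + i * h) for i in range(1, n))
    return h * (0.5 * f(a) + inner + 0.5 * f(b))

for n in (10, 20, 40, 80):
    approx = trapezoid(lambda u: math.cos(u * u), 0.0, 1.0, n)
    print(n, approx)
```

Doubling n reduces the error by a factor of about 4, which reflects the fact that the error behaves like a multiple of h² for smooth f.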


Example. The upper pictures of Fig. 6.4 show the functions $\cos x^2$ and $\sin x^2$
together with the trapezoidal approximations (step size h = 0.5, N = 10). The
points of the lower pictures represent approximations to Fresnel’s Integrals
obtained with h = 1/2 and h = 1/8; the corresponding values are connected by
straight lines.

FIGURE 6.4. Fresnel’s Integrals by the Trapezoidal Rule


Simpson’s Method (named after Simpson 1743). The idea is to choose three
successive values of $f(x_i)$ ($y_i = f(x_i)$) and to compute the parabola of interpolation
through these points (see Theorem I.1.2 and Eq. (2.6)):

$$p(x) = y_0 + (x-x_0)\,\frac{\Delta y_0}{h} + \frac{(x-x_0)(x-x_1)}{2}\,\frac{\Delta^2 y_0}{h^2}.$$

With the substitution $x = x_0 + th$, the area between the x-axis and this parabola
becomes

(6.7)   $$\int_{x_0}^{x_2}p(x)\,dx = 2h\cdot y_0 + h\int_0^2 t\,dt\cdot\Delta y_0 + h\int_0^2\frac{t(t-1)}{2}\,dt\cdot\Delta^2 y_0 = \frac{h}{3}\,(y_0 + 4y_1 + y_2).$$

We find Simpson’s Rule (N even)

(6.8)   $$\int_a^b f(x)\,dx \approx \frac{h}{3}\Big(f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + 2f(x_4) + \ldots + f(x_N)\Big).$$
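
Simpson’s Rule can be sketched in a few lines. The function name `simpson` and the test integral 1/x on [1, 10] are our own choices for illustration; the weights 1, 4, 2, 4, ..., 2, 4, 1 are those of the rule above.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson rule; n (the number of subintervals) must be even."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    odd = sum(f(a + i * h) for i in range(1, n, 2))    # weight 4
    even = sum(f(a + i * h) for i in range(2, n, 2))   # weight 2
    return h / 3.0 * (f(a) + 4.0 * odd + 2.0 * even + f(b))

for n in (12, 24, 48, 96):
    print(n, simpson(lambda x: 1.0 / x, 1.0, 10.0, n), math.log(10.0))
```

Doubling n reduces the error by a factor of roughly 16 once h is small, compared with 4 for the trapezoidal rule.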


Newton-Cotes Methods. Taking higher degree interpolation polynomials, we
find, in the same way,

$$\int_{x_0}^{x_3}f(x)\,dx \approx \frac{3h}{8}\Big(f(x_0) + 3f(x_1) + 3f(x_2) + f(x_3)\Big),$$
$$\int_{x_0}^{x_4}f(x)\,dx \approx \frac{2h}{45}\Big(7f(x_0) + 32f(x_1) + 12f(x_2) + 32f(x_3) + 7f(x_4)\Big),$$

and so on. The first one, due to Newton (1671), is called the 3/8-rule. In 1711,
Cotes computed these formulas for all degrees up to 10 (see Goldstine 1977,
p. 77).
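
Applied on consecutive groups of three or four subintervals, these two formulas give composite rules. The sketch below is our own illustration; the function names and the requirement that N be a multiple of 3 or 4 are assumptions of this sketch, not statements of the text.

```python
import math

def newton_three_eighths(f, a, b, n):
    """Composite 3/8-rule; n must be a multiple of 3."""
    if n % 3:
        raise ValueError("n must be a multiple of 3")
    h = (b - a) / n
    total = 0.0
    for i in range(0, n, 3):
        x0 = a + i * h
        total += f(x0) + 3.0 * f(x0 + h) + 3.0 * f(x0 + 2.0 * h) + f(x0 + 3.0 * h)
    return 3.0 * h / 8.0 * total

def cotes_degree_four(f, a, b, n):
    """Composite rule from the degree 4 formula; n must be a multiple of 4."""
    if n % 4:
        raise ValueError("n must be a multiple of 4")
    h = (b - a) / n
    total = 0.0
    for i in range(0, n, 4):
        x0 = a + i * h
        total += (7.0 * f(x0) + 32.0 * f(x0 + h) + 12.0 * f(x0 + 2.0 * h)
                  + 32.0 * f(x0 + 3.0 * h) + 7.0 * f(x0 + 4.0 * h))
    return 2.0 * h / 45.0 * total

f = lambda x: 1.0 / x
for n in (12, 24, 48):
    print(n, newton_three_eighths(f, 1.0, 10.0, n), cotes_degree_four(f, 1.0, 10.0, n))
```

The value N = 12 is convenient for comparisons because it is divisible by 2, 3, and 4, so that all rules of this section can be applied with the same step size.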


Numerical Examples. We compute approximations of $\int_1^{10}\frac{dx}{x} = \ln(10)$ with the
above methods for N = 12, 24, 48, .... The results are presented in Table 6.1. We
observe a genuine improvement only in every second column (for an explanation,
see Exercise 6.5).

TABLE 6.1. Computation of $\int_1^{10}\frac{dx}{x}$ with different quadrature formulas

   N      Trapezoid    Simpson            Newton             Cotes
   12     2.34         2.307              2.31               2.305
   24     2.31         2.303              2.303              2.3027
   48     2.305        2.3026             2.3026             2.30259
   96     2.303        2.302587           2.30259            2.3025852
   192    2.3027       2.3025852          2.3025854          2.302585095
   384    2.3026       2.3025851          2.3025851          2.3025850930
   768    2.3025       2.302585093        2.302585094        2.3025850929947
   1536   2.302587     2.3025850930       2.3025850930       2.30258509299405
   3072   2.3025858    2.302585092996     2.302585092999     2.3025850929940458
   6144   2.3025852    2.3025850929941    2.3025850929943    2.302585092994045686

An interesting phenomenon can be observed when applying the trapezoidal
rule to the elliptic integral $P = \int_0^{2\pi}\sqrt{1-\alpha\cos^2 t}\,dt$ (here with b = 0.2,
$\alpha$ = 0.96, see Table 6.2). It converges much better than expected. The reason is that the
function f(t) is periodic and the “superconvergence” is explained by the
Euler-Maclaurin formula of Sect. II.10.
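
This behaviour is easy to reproduce. In the sketch below (our own illustration; the function name and the values of N are assumptions and need not coincide with those used for Table 6.2), the two end values of a periodic integrand coincide, so that the trapezoidal rule over a full period reduces to h times the sum of N equidistant function values.

```python
import math

def periodic_trapezoid(f, period, n):
    """Trapezoidal rule over one full period of a periodic function."""
    h = period / n
    return h * sum(f(i * h) for i in range(n))

alpha = 0.96   # corresponds to b = 0.2

def integrand(t):
    return math.sqrt(1.0 - alpha * math.cos(t) ** 2)

for n in (12, 24, 48, 96, 192):
    print(n, periodic_trapezoid(integrand, 2.0 * math.pi, n))
```

With each doubling of N the number of correct digits roughly doubles until rounding errors of the floating-point arithmetic take over.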
TABLE 6.2. Computation of an elliptic integral with the trapezoidal rule

   Trapezoid
   4.1
   4.201
   4.2020080
   4.20200890792
   4.20200890793780018891
   4.2020089079378001889398329176947477824


Asymptotic Expansions 


This method was used by Laplace (1812) for $\int_0^x e^{-t^2}dt$ (see Oeuvres, tome VII,
p. 104 and Exercise 6.7) and by Cauchy in 1842 for Fresnel’s integrals (see Kline
1972, p. 1100). Whereas series expansions and numerical methods are useful for
small and moderate values of x, the method of asymptotic expansions is especially
adapted for large x.

We illustrate this technique on the example of Fresnel’s integrals. For the
limiting case $x \to \infty$ the exact value of the integral is known to be (Exercise
IV.5.14)

(6.9)   $$\int_0^\infty\cos t^2\,dt = \int_0^\infty\sin t^2\,dt = \frac12\sqrt{\frac{\pi}{2}}.$$

The idea is now to split the integral according to $\int_0^x = \int_0^\infty - \int_x^\infty$, i.e.,

(6.10)   $$\int_0^x\cos t^2\,dt = \frac12\sqrt{\frac{\pi}{2}} - \int_x^\infty\cos t^2\,dt.$$

To the integral on the right, we artificially add the factors 2t and 1/(2t) and apply
integration by parts with $u(t) = 1/t$, $v(t) = \sin t^2$. This yields

$$-\int_x^\infty\cos t^2\,dt = -\frac12\int_x^\infty\frac1t\cdot2t\cos t^2\,dt = \frac12\,\frac1x\,\sin x^2 - \frac12\int_x^\infty\frac{1}{t^2}\sin t^2\,dt.$$

We find an integral that appears by no means easier than the first one. However,
for x large, the integral on the right, which contains the additional factor $1/t^2$,
is much smaller than the original one. Therefore, $(2x)^{-1}\sin x^2$ will be a good
approximation for $-\int_x^\infty\cos t^2\,dt$. If the precision is not yet good enough, we
repeat the same procedure (here with $u(t) = 1/t^3$ and $v(t) = -\cos t^2$),

(6.11)   $$-\frac12\int_x^\infty\frac{1}{t^2}\sin t^2\,dt = -\frac{1}{2\cdot2}\,\frac{1}{x^3}\cos x^2 + \frac{1\cdot3}{2\cdot2}\int_x^\infty\frac{1}{t^4}\cos t^2\,dt.$$
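Continuing like this, we find from (6.10) that

(6.12)   $$\int_0^x\cos t^2\,dt = \frac12\sqrt{\frac\pi2} + \frac12\,\frac1x\sin x^2 - \frac{1}{2\cdot2}\,\frac{1}{x^3}\cos x^2 - \frac{1\cdot3}{2\cdot2\cdot2}\,\frac{1}{x^5}\sin x^2$$
$$\qquad\qquad + \frac{1\cdot3\cdot5}{2\cdot2\cdot2\cdot2}\,\frac{1}{x^7}\cos x^2 + \frac{1\cdot3\cdot5\cdot7}{2\cdot2\cdot2\cdot2\cdot2}\,\frac{1}{x^9}\sin x^2 - \ldots$$

An analogous formula is valid for

(6.13)   $$\int_0^x\sin t^2\,dt = \frac12\sqrt{\frac\pi2} - \frac12\,\frac1x\cos x^2 - \frac{1}{2\cdot2}\,\frac{1}{x^3}\sin x^2 + \frac{1\cdot3}{2\cdot2\cdot2}\,\frac{1}{x^5}\cos x^2$$
$$\qquad\qquad + \frac{1\cdot3\cdot5}{2\cdot2\cdot2\cdot2}\,\frac{1}{x^7}\sin x^2 - \frac{1\cdot3\cdot5\cdot7}{2\cdot2\cdot2\cdot2\cdot2}\,\frac{1}{x^9}\cos x^2 - \ldots$$

The extraordinary precision of these approximations for large x is illustrated in
Fig. 6.5. The numbers 1, 3, 5 indicate the last power of 1/x taken into account.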


FIGURE 6.5. Asymptotic expansions (6.12) and (6.13) with 1, 2, 3, 10, 20, and 30 terms


(6.1) Remark. The error of the truncated series (6.12) can easily be estimated. For
example, if we truncate after the term $(2x)^{-1}\sin x^2$, the above derivation shows
that the error is given by the value of the integral in (6.11) (taken over $x \le t < \infty$).
Using $|\cos t^2| \le 1$ this yields the estimate $(2x^3)^{-1}$, which, for $x \ge 2$, is less than
0.0625.
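
The estimate of this remark can be observed numerically. The following sketch is our own illustration: the function names are assumptions, the reference value is computed from the power series of the Fresnel integral (which is still usable at x = 3 in double precision), and the signs and coefficients of the asymptotic terms follow the pattern obtained by repeating the integration by parts.

```python
import math

def fresnel_cos_series(x, terms=40):
    """Power series of the integral of cos(t^2) from 0 to x."""
    return sum((-1) ** k * x ** (4 * k + 1) / ((4 * k + 1) * math.factorial(2 * k))
               for k in range(terms))

def fresnel_cos_asymptotic(x, terms):
    """Limit value plus the first `terms` correction terms of the asymptotic expansion."""
    total = 0.5 * math.sqrt(math.pi / 2.0)
    coeff = 0.5           # 1/2, 1/(2*2), 1*3/(2*2*2), ...
    power = x             # x, x^3, x^5, ...
    for n in range(terms):
        trig = math.sin(x * x) if n % 2 == 0 else math.cos(x * x)
        sign = 1.0 if n % 4 in (0, 3) else -1.0   # signs +, -, -, +, +, -, -, ...
        total += sign * coeff * trig / power
        coeff *= (2 * n + 1) / 2.0
        power *= x * x
    return total

x = 3.0
reference = fresnel_cos_series(x)
for terms in (1, 2, 3):
    print(terms, abs(fresnel_cos_asymptotic(x, terms) - reference))
print("bound after one term:", 1.0 / (2.0 * x ** 3))
```

For a fixed x the errors first decrease when more terms are taken, but they do not tend to zero if the number of terms is increased indefinitely.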


(6.2) Remark. The infinite series (6.12) and (6.13) do not converge for a fixed x. 
The reason is that the general term contains the factor 1-3-5-7-9-... in the 
numerator, which dominates all other factors. Such series were called asymptotic 
expansions by Poincaré. 


Exercises 


6.1 (Joh. Bernoulli 1697). Derive the “series mirabili”

$$\int_0^1 x^x\,dx = 1 - \frac{1}{2^2} + \frac{1}{3^3} - \frac{1}{4^4} + \frac{1}{5^5} - \ldots.$$

Hint. Use the series for the exponential function in $x^x = e^{x\ln x}$ and compute
$\int x^n(\ln x)^n\,dx$ by integration by parts.

6.2 The integral $\int x^2\,dx/\sqrt{1-x^4}$ was encountered by Jac. Bernoulli in his
computation of the elastic line and by Leibniz in his study of the Isochrona
Paracentrica. Verify the formula (Leibniz 1694b)

$$\int_0^x\frac{t^2\,dt}{\sqrt{1-t^4}} = \frac13\,x^3 + \frac{1}{2\cdot7}\,x^7 + \frac{1\cdot3}{2\cdot4\cdot11}\,x^{11} + \frac{1\cdot3\cdot5}{2\cdot4\cdot6\cdot15}\,x^{15} + \ldots.$$

6.3 As in (6.7), derive the formulas of Newton and Cotes by integrating the
interpolation polynomials of degree 3 and 4 on the intervals $[x_0, x_3]$ and $[x_0, x_4]$,
respectively.

6.4 For the curve defined by (6.5) (see Fig. 6.2) prove that

a) the length of the arc between the origin and (x(t), y(t)) is equal to t; and
b) the radius of curvature at the point (x(t), y(t)) is equal to 1/(2t).

6.5 Prove that Simpson’s method is exact for all polynomials of degree 3.

6.6 Compute

$$\int_0^1\frac{\ln(1+x)}{1+x^2}\,dx$$

with the help of Simpson’s method. Study the decrease of the error with
increasing N.
Result. The correct value is $(\pi/8)\ln 2 = 0.2721982613$.

6.7 Using $\int_0^\infty e^{-t^2}dt = \sqrt\pi/2$ (see (IV.5.41) below), derive an asymptotic
expansion for the error function $\Phi(x) = \frac{2}{\sqrt\pi}\int_0^x e^{-t^2}dt$ that is valid for large
values of x (Laplace 1812, Livre premier, No. 44).

$$\Phi(x) = 1 - \frac{e^{-x^2}}{\sqrt\pi\,x}\Big(1 - \frac{1}{2x^2} + \frac{1\cdot3}{2^2x^4} - \frac{1\cdot3\cdot5}{2^3x^6} + \ldots\Big).$$

6.8 Compute numerically the integral

$$\int_0^\infty\frac{1}{\sqrt x}\,\cos x^2\,dx = \frac{\pi\sqrt2\,\sqrt{2+\sqrt2}}{4\,\Gamma(3/4)} \approx 1.674813394.$$

Choose two numbers A ≈ 1/10 and B ≈ 10 and compute the integral
a) on the interval (0, A] by a series;
b) on the interval [A, B] by Simpson’s method; and
c) on the interval [B, ∞) by an asymptotic expansion.
II.7 Ordinary Differential Equations


Ergo & horum integralia aequantur. (Jac. Bernoulli 1690) 


In Sects. II.4 and II.5, we treated the problem of finding a primitive of a given
function f(x), i.e., we were looking for a function y(x) satisfying $y'(x) = f(x)$.
Here, we consider the more difficult problem where the function f may also
depend on the unknown function y(x). An ordinary differential equation is a relation
of the form

(7.1)   $$y' = f(x, y).$$

We are searching for a function y(x) such that $y'(x) = f(x, y(x))$ for all x in a
certain interval. Let us begin with some historical examples (for more details, see
Wanner 1988).


The Isochrone of Leibniz. Galilei discovered that a body, falling from the origin
along the y-axis, increases its velocity according to $v = \sqrt{-2gy}$, where g is the
acceleration due to gravity. During his dispute with the Cartesians about
mechanics, Leibniz (in the Sept. 1687 issue of the journal Nouvelles de la République des
lettres) poses the following problem: find a curve y(x) (see Fig. 7.1) such that,
when the body is sliding along this curve, its vertical velocity dy/dt is everywhere
equal to a given constant −b.

FIGURE 7.1. Leibniz’s isochrone


One month later, “Vir Celeberrimus Christianus Hugenius” (Huygens) gives 
the solution, “sed suppressa demonstratione & explicatione”. The “demonstratio”, 
then published in Leibniz (1689), is unsatisfactory, since the solution is guessed 
and then shown to possess the desired property. A general method for finding the 
solution with the help of the “modern” differential calculus was then published 
by Jac. Bernoulli (1690). This started the era of spectacular discoveries made by 
Jac. and Joh. Bernoulli, later by Euler and Daniel Bernoulli, and made Basel for 
several decades the world center of mathematical research. 

Let us write Galilei’s formula as 
(7.2)   $$\Big(\frac{ds}{dt}\Big)^2 = \frac{dx^2+dy^2}{dt^2} = -2gy \qquad (s = \text{arc length}),$$

divide by $(dy/dt)^2 = b^2$ (which is the required condition), and obtain

(7.3)   $$\Big(\frac{dx}{dy}\Big)^2 + 1 = \frac{-2gy}{b^2} \qquad\text{or}\qquad \frac{dy}{dx} = \frac{-1}{\sqrt{-1-2gy/b^2}},$$

a differential equation as in (7.1). In order to understand Bernoulli’s idea, we write
(7.3) as

(7.4)   $$dx = -\sqrt{-1-\frac{2gy}{b^2}}\;dy,$$

which expresses the fact (see Fig. 7.1) that the two striped rectangles always have
the same area. So Jacob writes “Ergo & horum Integralia aequantur” (this is the
first appearance in mathematics of the word “integral”), meaning that the areas $S_1$
and $S_2$ also have to be equal. After integrating, we find the solution

$$x = \frac{b^2}{3g}\Big(-1-\frac{2gy}{b^2}\Big)^{3/2}$$

(up to an additive constant of integration), and the “Solutio sit linea paraboloeides
quadrato cubica ...” (Leibniz).


The Tractrix. 


The distinguished Parisian physician Claude Perrault, equally famous for 
his work in mechanics and in architecture, well known for his edition of 
Vitruvius, and in his lifetime an important member of the Royal French 
Academy of Science, proposed this problem to me and to many others be- 
fore me, readily admitting that he had not been able to solve it ... 
(Leibniz 1693) 


While Leibniz was in Paris (1672-1676) taking mathematical lessons from
Huygens, the famous anatomist and architect Claude Perrault formulated the
following problem: for which curve is the tangent at each point P of constant
length a between P and the x-axis (Fig. 7.2)? To illustrate this question, he took
out of his fob a “horologio portabili suae thecae argenteae” and pulled it across
the table. He mentioned that no mathematician from Paris or Toulouse (Fermat)
was able to find the formula.

Leibniz published his solution in 1693 (see Leibniz 1693), asserting that he
had known it for quite some time: as

$$\frac{dy}{dx} = -\frac{y}{\sqrt{a^2-y^2}}, \qquad\text{i.e.,}\qquad -\frac{\sqrt{a^2-y^2}}{y}\,dy = dx,$$

one finds (“ergo & horum ...”) the solution by quadrature (Figs. 7.2 or 7.3).
Leibniz asserts that it was “a well-known fact” that this area is expressible with the
logarithm, which, using the substitution $\sqrt{a^2-y^2} = v$, $a^2 - y^2 = v^2$, $-y\,dy = v\,dv$,

(7.5)   $$x = \int_y^a\frac{\sqrt{a^2-y^2}}{y}\,dy = -\sqrt{a^2-y^2} - a\,\ln\frac{a-\sqrt{a^2-y^2}}{y},$$

turns out to be true (see also Exercise 7.1). We mention that Leibniz’s interest in
this theory also went the other way around: use Perrault’s watch as a mechanical
integration machine for the computation of integral (7.5) (and hence of
logarithms) and design other mechanical devices for similar integrals.

FIGURE 7.2. The tractrix          FIGURE 7.3. Sketch by Leibniz (1693)¹


The Catenary. 


But to better judge the quality of your algorithm I wait impatiently to see 
the results you have obtained concerning the shape of the hanging rope or 
chain, which Mr. Bernouilly proposed that you investigate, for which I am 
very grateful to him, because this curve possesses remarkable properties. I 
considered it long ago in my youth, when I was only 15 years old, and I 
proved to Father Mersenne that it was not a parabola .. . 

(Letter of Huygens to Leibniz, Oct. 9, 1690) 


The efforts of my brother were without success, I myself was more fortu- 
nate, since I found the way ... It is true that this required meditation which 
robbed me of sleep for an entire night ... 
(Joh. Bernoulli, see Briefwechsel, vol. 1, p. 98) 
Galilei (1638) asserted that a chain hanging from two nails forms “ad unguem” 
a parabola. Some 20 years later, a 16 year old Dutch boy (Christiaan Huygens) 
discovered that this result must be wrong. Finally, the solution of the problem of 
the shape of a hanging flexible line (“Linea Catenaria vel Funicularis”) by Leib- 
niz (1691b) and Joh. Bernoulli (1691) was an enormous success for the “new” 
calculus. Here are Johann’s ideas (Opera vol. III, p. 491-493). 
We let B be the lowest point and A an arbitrary point on the curve (Fig. 7.4).
We then draw the tangents AE and BE and imagine the mass of the chain of length
s between A and B concentrated in the point E hanging on two threads without
mass (“duorum filiorum nullius gravitatis”). Since the mass in E is proportional to
s, the parallelogram of forces in E shows that the slope in A is proportional to the
arc length, i.e.,

(7.6)   $$c\cdot y' = s.$$

FIGURE 7.4. The catenary          FIGURE 7.5. Catenary (Leibniz 1691)²

¹ Reproduced with permission of Bibl. Publ. Univ. Genève.

From here, Johann’s computations are very complicated, using second
differentials (see Opera vol. III, p. 426). They become easy, however, if we replace, in the
spirit of Riccati (see (7.21) below), the derivative y' by a new variable p and have
after differentiation

(7.7)   $$c\cdot dp = ds = \sqrt{1+p^2}\;dx,$$

a differential equation between the variables p and x. Integration gives

$$\int\frac{dp}{\sqrt{1+p^2}} = \int\frac{dx}{c}, \qquad\text{i.e.,}\qquad \operatorname{arsinh}(p) = \frac{x-x_0}{c},$$

(7.8)   $$p = \sinh\Big(\frac{x-x_0}{c}\Big) \qquad\text{and}\qquad y = K + c\cdot\cosh\Big(\frac{x-x_0}{c}\Big).$$


The Brachistochrone. 


Given two points A and B in a vertical plane, determine the path AMB
along which a moving particle M, starting at A and descending solely
under the influence of its weight, reaches B in the shortest time.

(Joh. Bernoulli 1696)


This problem seems to be one of the most curious and beautiful that has ever
been proposed, and I would very much like to apply my efforts to it, but
for this it would be necessary that you reduce it to pure mathematics, since
physics bothers me ...

(de L’ Hospital, letter to Joh. Bernoulli, June 15, 1696) 


> Reproduced with permission of Bibl. Publ. Univ. Genéve. 
Galilei proves in 1638 that a body sliding from A to C (Fig. 7.7) takes less time on
the detour ADC than on the shortest path (due to its larger initial velocity). He
continues and proves that ADEC, ADEFC, ADEFGC are always quicker and finally
concludes that the circle is the quickest of all paths. Hearing that his brother Jacob
makes the same mistake, Johann (1696) seizes this as the occasion for organizing
a public contest to find the brachistochrone line (βραχύς = short, χρόνος =
time). The solutions handed in on time, including Jacob's, were unfortunately all
correct; nevertheless, Johann's is the most elegant one: he makes an analogy to
"Fermat's Principle" (see Eq. (2.5)):


FIGURE 7.6. The brachistochrone
FIGURE 7.7. The wrong brachistochrone as seen by Galilei (path points A, D, E, F, G, C)


He thinks of many layers where the "speed of light" is given by $v = \sqrt{2gy}$
(see (7.2) and Fig. 7.6). The quickest path is the one satisfying everywhere the law
of refraction (Fermat's principle),

$\frac{v}{\sin\alpha} = K.$

Hence, we have, because of $\sin\alpha = dx/ds$ and with the abbreviation $c = K^2/(2g)$,

(7.9)    $\sqrt{1 + \Bigl(\frac{dy}{dx}\Bigr)^2}\cdot\sqrt{2gy} = K$   or   $dx = \sqrt{\frac{y}{c-y}}\;dy.$

Still in accordance with "ergo & horum integralia æquantur", the substitution

(7.10)    $y = c\cdot\sin^2 u = \frac{c}{2} - \frac{c}{2}\cos 2u$

leads to the formula

(7.11)    $x - x_0 = cu - \frac{c}{2}\sin 2u,$

"ex qua concludo Curvam Brachystochronam esse Cycloidem vulgarem".
Some Types of Integrable Equations


We now discuss some of the simplest types of differential equations, which can be 
solved by the computation of integrals. 


Equation with Separable Variables. 


(7.12) y’ = f(x)g(y). 


All of the preceding examples, namely, (7.3), (7.5), (7.7), and (7.9), are of this
type. They are solved by writing $y' = dy/dx$, by "separation of variables" and
integration ("ergo & ..."), i.e.,

(7.13)    $\frac{dy}{g(y)} = f(x)\,dx$   and   $\int \frac{dy}{g(y)} = \int f(x)\,dx + C.$

If $G(y)$ and $F(x)$ are primitives of $1/g(y)$ and $f(x)$, respectively, the solution is
expressed by $G(y) = F(x) + C$.
Linear Homogeneous Equation.

(7.14)    $y' = f(x)\,y.$

This is a special case of (7.12). Its solution is given by

(7.15)    $\ln y = \int f(x)\,dx + \overline{C}$,   or   $y = C\cdot\exp\Bigl(\int f(x)\,dx\Bigr).$


Linear Inhomogeneous Equation. 


(7.16)    $y' = f(x)\,y + g(x).$


Joh. Bernoulli proposes to write the solution as a product of two functions $y(x) =
u(x)\cdot v(x)$ (like Tartaglia's idea, Eq. (1.1.5)). We then obtain

$\frac{du}{dx}\cdot v + u\cdot\frac{dv}{dx} = f(x)\cdot u\cdot v + g(x).$

We can now equalize the two terms separately and find

(7.17a)    $\frac{du}{dx} = f(x)\cdot u$   to obtain $u$,

(7.17b)    $\frac{dv}{dx} = \frac{g(x)}{u(x)}$   to obtain $v$.


Equation (7.17a) is a homogeneous linear equation for $u$ and its solution is given
by (7.15). The function $v(x)$ is then obtained by integration of (7.17b).
Consequently, the solution of (7.16) is

(7.18)    $y(x) = C\cdot u(x) + u(x)\int_0^x \frac{g(t)}{u(t)}\,dt$,   $u(x) = \exp\Bigl(\int_0^x f(t)\,dt\Bigr).$

This relation expresses the fact that the solution of (7.16) is the sum of the general
solution of the homogeneous equation and a particular solution of the
inhomogeneous equation.
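The formula (7.18) can be checked numerically. The following is only a minimal sketch: the helper names `midpoint` and `linear_solution`, the midpoint quadrature, and the test equation $y' = -y + x$ are assumptions chosen for illustration, not part of the text.

```python
import math

def midpoint(func, a, b, n=400):
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def linear_solution(f, g, C, x):
    """Evaluate (7.18): y = C*u(x) + u(x) * int_0^x g(t)/u(t) dt,
    with u(x) = exp(int_0^x f(t) dt). Both integrals are approximated."""
    def u(s):
        return math.exp(midpoint(f, 0.0, s))
    return C * u(x) + u(x) * midpoint(lambda t: g(t) / u(t), 0.0, x)

# Illustrative equation (an assumption): y' = -y + x with y(0) = C = 2.
# Its solution in closed form is y = 3*exp(-x) + x - 1.
approx = linear_solution(lambda t: -1.0, lambda t: t, 2.0, 1.5)
exact = 3.0 * math.exp(-1.5) + 1.5 - 1.0
print(approx, exact)
```

Since $u(0) = 1$ and the integral in (7.18) vanishes at $x = 0$, the constant $C$ plays the role of the initial value $y(0)$ in this sketch.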


Bernoulli’s Differential Equation. 


In truth, there is nothing more ingenious than the solution that you give for 
your brother’s equation; and this solution is so simple that one is surprised 
at how difficult the problem appeared to be: this is indeed what one calls an 
elegant solution. (P. Varignon, letter to Joh. Bernoulli “6 Aoust 1697’) 


In 1695, Jac. Bernoulli struggles for months with the solution of

(7.19)    $y' = f(x)\cdot y + g(x)\cdot y^n.$

This is a good occasion for Jacob to organize an official contest. Unfortunately,
Johann has straightaway two elegant ideas (see Joh. Bernoulli 1697b). The first
idea is treated in Exercise 7.2. The second one is the same as explained above,
namely to write the solution as $y(x) = u(x)\cdot v(x)$. For the differential equation
(7.19) this again yields (7.17a) for $u$ and

(7.20)    $\frac{dv}{dx} = g(x)\,u^{n-1}(x)\,v^n,$

a differential equation that can be solved by separation of variables. This leads to
the solution

$y(x) = u(x)\Bigl(C + (1-n)\int_0^x g(t)\,u^{n-1}(t)\,dt\Bigr)^{1/(1-n)},$

where $u(x)$ is as in (7.18).


Second-Order Differential Equations 


To free the above formula from the second differences, ... , we denote the 
subnormal BF by p. (Riccati 1712) 


A second-order differential equation is of the form

$y'' = f(x, y, y').$

The analytic solution of such an equation is very seldom possible. There are a few
exceptions.


Equations Independent of y. It is natural to put $p = y'$, so that the differential
equation $y'' = f(x, y')$ becomes the first-order equation $p' = f(x, p)$. We remark
that the differential equation (7.7) of the catenary is actually of this type.


Equations Independent of x.

(7.21)    $y'' = f(y, y').$

The idea (Riccati 1712) is to consider $y$ as an independent variable and to search
for a function $p(y)$ such that $y' = p(y)$. The chain rule gives

$y'' = \frac{dp}{dy}\cdot\frac{dy}{dx} = p'\cdot p,$

and Eq. (7.21) becomes the first-order equation

(7.22)    $p'\cdot p = f(y, p).$

When the function $p(y)$ has been found from (7.22), it remains to integrate
$y' = p(y)$, which is an equation of type (7.12).


Example. The movement of a pendulum (see the sketch by Leonardo da Vinci) is
described by the equation

(7.23)    $y'' + \sin y = 0$

($y$ denotes the deviation from equilibrium). Since Eq. (7.23) does not depend on $t$
(we write $t$ instead of $x$, because this variable denotes the time in this example),
we can use the above transformation to obtain

$p\cdot dp = -\sin y\cdot dy$   and   $\frac{p^2}{2} = \cos y + C.$

[Sketch by Leonardo da Vinci. © Bibl. Nacional, Codex Madrid I 147r]


If we denote the amplitude of the oscillations by $A$ (for which $p = y' = 0$) we
have $C = -\cos A$ and get

(7.24)    $p = \frac{dy}{dt} = \sqrt{2\cos y - 2\cos A},$

which is a differential equation for $y$. Separation of the variables finally yields the
solution expressed in implicit form with an elliptic integral

(7.25)    $t = \int_0^y \frac{d\eta}{\sqrt{2\cos\eta - 2\cos A}}$

(the integration constant is determined by the assumption that $y = 0$ for $t = 0$).
If $T$ is the period of the oscillations, the maximal deviation $A$ is attained for
$t = T/4$. Hence, the period satisfies

(7.26)    $T = 4\int_0^A \frac{dy}{\sqrt{2\cos y - 2\cos A}} = 2\int_0^A \frac{dy}{\sqrt{\sin^2(A/2) - \sin^2(y/2)}}.$

We see that it depends on the amplitude $A$ and is close to $2\pi$ if $A$ is small
(Exercise 7.5).
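The dependence of the period on the amplitude can be made visible numerically. The sketch below is an illustration under stated assumptions: it applies the substitution $\sin(y/2) = k\sin\alpha$, $k = \sin(A/2)$, suggested in Exercise 7.5, which turns (7.26) into $T = 4\int_0^{\pi/2} d\alpha/\sqrt{1 - k^2\sin^2\alpha}$ with a smooth integrand; the function name `pendulum_period` and the midpoint rule are choices made here.

```python
import math

def pendulum_period(A, n=2000):
    """Period (7.26) after the substitution sin(y/2) = k*sin(alpha), k = sin(A/2):
    T = 4 * int_0^{pi/2} d(alpha) / sqrt(1 - k^2 sin^2(alpha)),
    approximated by the composite midpoint rule with n subintervals."""
    k = math.sin(A / 2.0)
    h = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        alpha = (i + 0.5) * h
        total += 1.0 / math.sqrt(1.0 - (k * math.sin(alpha)) ** 2)
    return 4.0 * h * total

for A in (0.01, 0.5, 1.2, 2.5):
    print(A, pendulum_period(A), pendulum_period(A) / (2.0 * math.pi))
```

The substitution removes the singularity of the integrand of (7.26) at $y = A$, which is why a simple quadrature rule is sufficient here.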
FIGURE 7.8. The isochronous pendulum of Huygens


The Isochronous Pendulum. The problem consists in modifying the standard 
pendulum in such a way that the period becomes independent of the amplitude. 
The idea of Huygens (1673, Horologium Oscillatorium) was to modify the cir- 
cle of the standard pendulum in such a way that the accelerating force becomes 
proportional to the arc length s. The movement of the pendulum would then be 
described by 


(7.27) s"+Ks=0, 

which has oscillations independent of the amplitude. 

Solution. We see from the two similar triangles in Fig. 7.8 (right) that the
accelerating force is $f = -dy/ds$, so that our requirement $f = -Ks$ becomes

(7.28)    $dy = K\cdot s\,ds.$

If $s = 0$ for $y = 0$ (i.e., the origin is placed in the lowest point) we obtain by
integration

(7.29)    $y = \frac{K}{2}\,s^2$   or   $s = \sqrt{\frac{2y}{K}}.$

Thus, for our curve the height is proportional to the square of the arc length
(Joh. Bernoulli 1691/92b, p. 489-490). Inserting $s$ from (7.29) into (7.28) gives

$\frac{dy}{\sqrt{y}} = \sqrt{2K}\,\sqrt{dx^2 + dy^2}$

or, by taking squares,

(7.30)    $\Bigl(\frac{c}{y} - 1\Bigr)\,dy^2 = dx^2$   and   $\sqrt{\frac{c-y}{y}}\;dy = dx$

with $c = 1/(2K)$. Apart from a shift in $y$, this is precisely equation (7.9)
for the brachistochrone, and we see that the isochronous pendulum is a cycloid
as Joh. Bernoulli (1697c) said: "animo revolvens inexpectatam illam identitatem
Tautochronae Hugeniae nostraeque Brachystochronae" (see Fig. 7.8).
Exercises


7.1 Compute the integral (7.5) for the tractrix with the substitution $y = a\cos t$,
    insert $\sin^2 t = 1 - \cos^2 t$, and apply the substitution (5.21).

7.2 (Joh. Bernoulli 1697b). Solve the differential equation "de mon Frère"

    (7.31)    $y' = g(x)\cdot y + f(x)\cdot y^n$

    by using the transformation $y = v^\beta$. Determine the constant $\beta$ such that
    (7.31) becomes a linear differential equation for $v$.

7.3 The logistic law of population growth is given by the differential equation
    (Verhulst 1845)

    $y' = by(a - y),$

    where $a$, $b$ are constants. Choose $a = 5$, $b = 2$ and find the solution satisfying
    $y(0) = 0.1$.

7.4 Show that a differential equation of the form

    $y' = G\Bigl(\frac{y}{x}\Bigr)$

    can be solved by the substitution $v(x) = y(x)/x$. Apply this method to

    $y' = \frac{9x + 2y}{2x + y}.$

7.5 The solution of the pendulum equation

    $y'' + \omega^2\sin y = 0,$

    corresponding to initial values $y(0) = A$, $y'(0) = 0$, has the period

    $T = \frac{2}{\omega}\int_0^A \bigl(\sin^2(A/2) - \sin^2(y/2)\bigr)^{-1/2}\,dy$

    (see Eq. (7.26)). Set $k = \sin(A/2)$, apply the substitution $\sin(y/2) = k\cdot\sin\alpha$,
    and compute the first terms of the expansion of $T$ in powers of $k$.

    Result. $\frac{2\pi}{\omega}\Bigl(1 + k^2\bigl(\tfrac{1}{2}\bigr)^2 + k^4\bigl(\tfrac{1\cdot 3}{2\cdot 4}\bigr)^2 + \ldots\Bigr) = \frac{2\pi}{\omega}\Bigl(1 + \frac{A^2}{16} + \frac{11A^4}{3072} + \frac{173A^6}{737280} + \ldots\Bigr).$

7.6 Solve the differential equation

    ae y?
    ~ 44a?"
    y

7.7 The motion of a body in the earth's gravitational field is described by the
    differential equation

    $y'' = -\frac{gR^2}{y^2},$

    where $g = 9.81$ m/sec², $R = 6.36\cdot 10^6$ m, and $y$ is the distance of the
    body to the center of the earth. Determine the constants in the solution such that
    $y(0) = R$ and $y'(0) = v$. Then, find the smallest velocity $v$ for which the
    body will not return to earth (escape velocity).
II.8 Linear Differential Equations


... it is today quite impossible to swallow a single line of d'Alembert, while
most writings of Euler can still be read with delight.
(Jacobi, see Spiess 1929, p. 139)

Let $a_0(x), a_1(x), \ldots, a_{n-1}(x)$ be given functions. We call

(8.1)    $y^{(n)} + a_{n-1}(x)\,y^{(n-1)} + \ldots + a_1(x)\,y' + a_0(x)\,y = 0$

a homogeneous linear differential equation of order $n$ and

(8.2)    $y^{(n)} + a_{n-1}(x)\,y^{(n-1)} + \ldots + a_1(x)\,y' + a_0(x)\,y = f(x)$

an inhomogeneous linear differential equation. For the left-hand side of these
equations we introduce the abbreviation

(8.3)    $\mathcal{L}(y) := y^{(n)} + a_{n-1}(x)\,y^{(n-1)} + \ldots + a_0(x)\,y,$

so that (8.1) and (8.2) become

(8.4)    $\mathcal{L}(y) = 0$   and   $\mathcal{L}(y) = f,$

respectively. We call $\mathcal{L}$ a differential operator. It operates on functions $y(x)$, and
the result $\mathcal{L}(y)$ is again a function, given by (8.3). The main property of this
operator is that it is linear, i.e.,

(8.5)    $\mathcal{L}(c_1 y_1 + c_2 y_2) = c_1\mathcal{L}(y_1) + c_2\mathcal{L}(y_2).$


An obvious consequence of this linearity is the following result. 


(8.1) Lemma. Given $n$ solutions $y_1(x), y_2(x), \ldots, y_n(x)$ for the homogeneous
equation (8.1), then for arbitrary constants $c_1, \ldots, c_n$, the function

(8.6)    $c_1 y_1(x) + c_2 y_2(x) + \ldots + c_n y_n(x)$

is also a solution of the same equation.


Remark. The solutions of the equations of order 1 involve one constant (see
Sect. II.7) and the equations of order 2 have two arbitrary constants (see, for
example, Eq. (7.23)). Arguing by analogy, we can assume (Euler) that the
equations of order $n$ have $n$ constants and that (8.6) is the general solution of
(8.1), if $y_1(x), \ldots, y_n(x)$ are linearly independent functions. Here, the functions
$y_1(x), \ldots, y_n(x)$ are called linearly independent if the linear combination (8.6)
vanishes identically only in the case when all $c_i$ are zero. For example, $1, x, x^2, x^3$
are linearly independent functions.
(8.2) Lemma.

      general solution of the homogeneous equation (8.1)
    + one particular solution of the inhomogeneous equation (8.2)
    = general solution of the inhomogeneous equation (8.2).


Proof. Let $\tilde{y}$ be a particular solution of (8.2), i.e., $\mathcal{L}(\tilde{y}) = f$. For an arbitrary
solution $y$ of (8.1) (i.e., $\mathcal{L}(y) = 0$) we then have $\mathcal{L}(y + \tilde{y}) = f$ by (8.5), so that
$y + \tilde{y}$ is a solution of (8.2).

On the other hand, if $\hat{y}$ is another solution of (8.2) (i.e., $\mathcal{L}(\hat{y}) = f$) then,
again by (8.5), we have $\mathcal{L}(\hat{y} - \tilde{y}) = 0$ and $\hat{y} = \tilde{y} + (\hat{y} - \tilde{y})$ is the sum of $\tilde{y}$ and a
solution of the homogeneous equation (8.1).


Conclusion. In order to solve the differential equations (8.1) and (8.2), one has to 
— find n different solutions (linearly independent) of (8.1), and 
— find one solution of (8.2). 


Homogeneous Equation with Constant Coefficients 


The complete solution of Eq. (8.1) is very seldom possible. However, there are a
few exceptions. The most important one is when the coefficients $a_i(x)$ are
independent of $x$, i.e.,

(8.7)    $y^{(n)} + a_{n-1}\,y^{(n-1)} + \ldots + a_1\,y' + a_0\,y = 0.$

Another exception is when $a_i(x) = a_i x^{i-n}$ ("Cauchy's Equation"). This case will
be considered at the end of this section.

The essential idea for solving (8.7) (Euler communicated it on Sept. 15, 1739
in a letter to Joh. Bernoulli and published it in 1743) is to search for solutions of
the form

(8.8)    $y(x) = e^{\lambda x},$

where $\lambda$ is a constant to be determined. Computing the derivatives

$y'(x) = \lambda e^{\lambda x}$,   $y''(x) = \lambda^2 e^{\lambda x}$,   $\ldots$,   $y^{(n)}(x) = \lambda^n e^{\lambda x},$

and inserting them into Eq. (8.7), yields

(8.9)    $(\lambda^n + a_{n-1}\lambda^{n-1} + \ldots + a_1\lambda + a_0)\,e^{\lambda x} = 0.$

Hence, the function (8.8) is a solution of (8.7) if and only if $\lambda$ is a root of the
so-called characteristic equation

(8.10)    $\chi(\lambda) = 0$,   $\chi(\lambda) = \lambda^n + a_{n-1}\lambda^{n-1} + \ldots + a_1\lambda + a_0.$
Distinct Roots. If Eq. (8.10) has $n$ distinct roots, say $\lambda_1, \ldots, \lambda_n$, then $e^{\lambda_1 x}, \ldots,
e^{\lambda_n x}$ are $n$ linearly independent solutions of (8.7) (see Exercise 8.1). The general
solution is thus given by

(8.11)    $y(x) = c_1 e^{\lambda_1 x} + c_2 e^{\lambda_2 x} + \ldots + c_n e^{\lambda_n x}.$
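As a small illustration of the case of distinct roots, the following sketch inserts a combination of exponentials into the left-hand side of (8.7) and evaluates it. The function names and the sample equation $y''' - 6y'' + 11y' - 6y = 0$ (whose characteristic polynomial has the roots 1, 2, 3) are assumptions made for this example only.

```python
import cmath

def derivative(lams, cs, j, x):
    """j-th derivative of y(x) = sum_k c_k exp(lam_k x), namely sum_k c_k lam_k^j exp(lam_k x)."""
    return sum(c * lam ** j * cmath.exp(lam * x) for c, lam in zip(cs, lams))

def lhs(a, lams, cs, x):
    """Left-hand side of (8.7) with coefficients a = [a_0, a_1, ..., a_{n-1}]."""
    n = len(a)
    return derivative(lams, cs, n, x) + sum(a[j] * derivative(lams, cs, j, x) for j in range(n))

# Sample equation (an assumption): y''' - 6y'' + 11y' - 6y = 0, roots 1, 2, 3.
a = [-6.0, 11.0, -6.0]
print(abs(lhs(a, [1.0, 2.0, 3.0], [2.0, -1.0, 0.5], 0.5)))   # close to 0
# Complex roots work the same way: y'' + y = 0 has the roots +i and -i.
print(abs(lhs([1.0, 0.0], [1j, -1j], [0.3, 0.7], 1.0)))      # close to 0
# An exponent that is not a root leaves a nonzero left-hand side.
print(abs(lhs(a, [0.0], [1.0], 0.5)))
```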


Multiple Roots. Consider first the simple differential equation

(8.12)    $y^{(n)} = 0,$

where the characteristic equation $\lambda^n = 0$ has a root zero of multiplicity $n$.
Obviously, the general solution of (8.12) is $c_1 + c_2 x + c_3 x^2 + \ldots + c_n x^{n-1}$, a polynomial
of degree $n - 1$.

Next, we study the equation

(8.13)    $y''' - 3a\,y'' + 3a^2\,y' - a^3\,y = 0,$

where the characteristic equation $(\lambda - a)^3 = 0$ has the root $a$ of multiplicity 3.
We introduce a new unknown function $u(x)$ by the relation (Euler 1743b)

(8.14)    $y(x) = e^{ax}\cdot u(x).$

Then, differentiating this relation three times and inserting the results into (8.13),
we obtain for $u$ Eq. (8.12) with $n = 3$. Therefore, the general solution of (8.13) is
given by

(8.15)    $y(x) = e^{ax}\cdot(c_1 + c_2 x + c_3 x^2).$


Differential Operators. The above calculations become particularly elegant if we
introduce, for a given constant $a$, the differential operator $D_a$ by

(8.16)    $D_a y = y' - a\cdot y.$

The composition of two such operators $D_a$ and $D_b$ gives

(8.17)    $D_b D_a y = (y' - ay)' - b(y' - ay) = y'' - (a + b)y' + aby = D_a D_b y.$

We observe that $D_a$ and $D_b$ commute and that $D_a D_b D_c \ldots y = 0$ is the
differential equation (8.7) whose coefficients are those of the characteristic polynomial
$(\lambda - a)(\lambda - b)(\lambda - c)\ldots$. Therefore, Eq. (8.13) is the same as

(8.13')    $D_a^3\,y = 0.$

Applying $D_a$ to (8.14), we obtain

$D_a y = a e^{ax}\cdot u + e^{ax}\cdot u' - a e^{ax}\cdot u = e^{ax}\cdot u',$

$D_a^2 y = e^{ax}\cdot u''$, and finally $D_a^3 y = e^{ax}\cdot u'''$. This verifies that (8.15) is the
general solution of (8.13).


(8.3) Theorem (Euler 1743b). Suppose that the characteristic polynomial (8.10)
has the factorization

$\chi(\lambda) = (\lambda - \lambda_1)^{m_1}(\lambda - \lambda_2)^{m_2}\cdots(\lambda - \lambda_k)^{m_k}$

(with distinct $\lambda_i$), then the general solution of (8.7) is given by

(8.18)    $y(x) = p_1(x)e^{\lambda_1 x} + p_2(x)e^{\lambda_2 x} + \ldots + p_k(x)e^{\lambda_k x},$

where the $p_i(x)$ are arbitrary polynomials of degree $m_i - 1$ (this solution involves
precisely $\sum_{i=1}^{k} m_i = n$ constants).


Proof. We illustrate the proof for the case of two multiple roots $\chi(\lambda) = (\lambda -
a)^3(\lambda - b)^4$. Because of the permutability of $D_a$ and $D_b$, we can write the
differential equation either as

(8.19)    $D_b^4 D_a^3\,y = 0$   or as   $D_a^3 D_b^4\,y = 0.$

The solution $y = e^{ax}\cdot(c_1 + c_2 x + c_3 x^2)$ of $D_a^3 y = 0$ is seen to be reduced to zero
by the left-hand version of (8.19); the solution $y = e^{bx}\cdot(c_4 + c_5 x + c_6 x^2 + c_7 x^3)$
of $D_b^4 y = 0$ is annulled by the right-hand version. Both are therefore solutions and
have together seven free constants (see Exercise 8.2 for the linear independence
of the functions involved).


Avoiding Complex Arithmetic. The result of Theorem 8.3 is valid also for
complex $\lambda_i$. If, however, the coefficients $a_i$ of Eq. (8.7) are real, we are mainly
interested in real-valued solutions. The fact that complex roots of real polynomials
always appear in conjugate pairs allows us to simplify (8.18). Let $\lambda_1 = \alpha + i\beta$
and $\lambda_2 = \alpha - i\beta$ be two such roots. The corresponding part of the solution (8.18)
is then a polynomial multiplied by

(8.20)    $e^{\alpha x}\bigl(c_1 e^{i\beta x} + c_2 e^{-i\beta x}\bigr).$

Using Euler's formula (1.5.4), this expression becomes

(8.21)    $e^{\alpha x}(d_1\cos\beta x + d_2\sin\beta x),$

where $d_1 = c_1 + c_2$ and $d_2 = i(c_1 - c_2)$ are new constants. This expression can
be further simplified by the use of $d_2 + i d_1 = Ce^{i\varphi} = C\cos\varphi + iC\sin\varphi$. We
then get with Eq. (1.4.3) (see Fig. 8.1)

$Ce^{\alpha x}(\sin\varphi\cos\beta x + \cos\varphi\sin\beta x) = Ce^{\alpha x}\sin(\beta x + \varphi).$
FIGURE 8.1. Stable and unstable oscillations (panels: $\alpha > 0$ and $\alpha < 0$)


Example. Equation (7.23) of the pendulum can, for small oscillations, be
simplified by replacing $\sin y$ by $y$, and becomes

(8.22)    $y'' + \omega^2 y = 0$,   $\omega^2 = g/\ell,$

where $g = 9.81$ m/sec² and $\ell$ is the length of the rod. The characteristic equation
$\lambda^2 + \omega^2 = 0$ has the roots $\pm i\omega$. Hence, the general solution of (8.22) is

$y(t) = C\sin(\omega t + \varphi),$

which has period

$T = 2\pi/\omega = 2\pi\sqrt{\ell/g}.$


Inhomogeneous Linear Equations

The problem consists in finding one particular solution of $\mathcal{L}(y) = f$, i.e.,

(8.23)    $y^{(n)} + a_{n-1}\,y^{(n-1)} + \ldots + a_1\,y' + a_0\,y = f(x).$

As an immediate consequence of the linearity (8.5), we have the following
result.


(8.4) Lemma (Superposition Principle). Let $y_1(x)$ and $y_2(x)$ be solutions of
$\mathcal{L}(y_1) = f_1$ and $\mathcal{L}(y_2) = f_2$, then $c_1 y_1(x) + c_2 y_2(x)$ is a solution of $\mathcal{L}(y) =
c_1 f_1 + c_2 f_2$.


In situations where the inhomogeneity $f(x)$ in (8.23) can be split into a sum
of simple terms, the individual terms can be treated separately.


The Quick Method (Euler 1750b). This approach is possible if $f(x)$ is a linear
combination of $x^j$, $e^{\alpha x}$, $e^{\alpha x}\sin(\omega x), \ldots$; more precisely, if $f(x)$ itself is a
solution of some homogeneous linear equation with constant coefficients. The idea is
to look for a solution with the same structure.


Example. Consider a case where $f$ is a polynomial of degree 2, e.g.,

(8.24)    $y''' + 5y'' + 2y' + y = 2x^2 + x.$

We will search for a solution of the form

(8.25)    $y(x) = a + bx + cx^2.$

Computing the derivatives of (8.25) and inserting them into (8.24) yields

$cx^2 + (b + 4c)x + (a + 2b + 10c) = 2x^2 + x.$

Comparison of the coefficients gives $c = 2$, $b = -7$ and $a = -6$, so that a
particular solution of (8.24) is

$y(x) = 2x^2 - 7x - 6.$
Example. Suppose now that $f(x)$ is a sine function

(8.26)    $y'' - y' + y = \sin 2x.$

It is not sufficient to take $y(x) = a\cdot\sin 2x$, because $y'$ also produces $\cos 2x$.
Therefore, we put

(8.27)    $y(x) = a\cdot\sin 2x + b\cdot\cos 2x,$

compute the derivatives, and insert them into (8.26). This gives the condition

$(a + 2b - 4a)\sin 2x + (b - 2a - 4b)\cos 2x = \sin 2x.$

We obtain the linear system $-3a + 2b = 1$, $-2a - 3b = 0$ with the solution
$a = -3/13$, $b = 2/13$. Consequently, the particular solution is

(8.28)    $y(x) = -\frac{3}{13}\sin 2x + \frac{2}{13}\cos 2x.$

Another possibility for solving (8.26) is to consider the equation

(8.29)    $y'' - y' + y = e^{2ix}$

and to search for a solution of the form $y(x) = Ae^{2ix}$. Inserting its derivatives
yields $-4A - 2iA + A = 1$ and $A = (-3 + 2i)/13$. Hence, the solution of (8.29)
is

(8.30)    $y(x) = \frac{-3 + 2i}{13}\,e^{2ix}.$

Since (8.26) is just the imaginary part of (8.29), we get a solution of (8.26) by
taking the imaginary part of (8.30).
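The complex route can be followed literally with complex floating-point arithmetic. This is a sketch for illustration; the variable and function names are assumptions, and the residual is formed from the exact derivatives of $Ae^{2ix}$ rather than from numerical differentiation.

```python
import cmath
import math

lam = 2j                            # exponent of the right-hand side e^{2ix} in (8.29)
A = 1 / (lam ** 2 - lam + 1)        # inserting y = A e^{2ix} gives (lam^2 - lam + 1) A = 1

def y(x):
    """Imaginary part of (8.30), a particular solution of (8.26)."""
    return (A * cmath.exp(lam * x)).imag

def residual(x):
    """y'' - y' + y - sin 2x, using z' = lam*z and z'' = lam^2*z for z = A e^{2ix}."""
    z = A * cmath.exp(lam * x)
    return (lam ** 2 * z - lam * z + z).imag - math.sin(2 * x)

print(A)               # real part -3/13, imaginary part 2/13
print(residual(0.7))   # close to 0
```

The real part of $A$ is the coefficient of $\sin 2x$ and the imaginary part is the coefficient of $\cos 2x$ in (8.28).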


Justification of This Approach. By assumption, $f(x)$ satisfies $\mathcal{L}_1(f) = 0$, where
$\mathcal{L}_1 = D_a D_b \ldots$ is some differential operator with constant coefficients.
Applying this operator to Eq. (8.23), i.e., $\mathcal{L}(y) = f$, we get $(\mathcal{L}_1\mathcal{L})(y) = 0$, and the
solution of (8.23) is seen to satisfy the linear homogeneous differential equation
$(\mathcal{L}_1\mathcal{L})(y) = 0$. The general solution of this equation is known by Theorem 8.3.
FIGURE 8.2. Solution for $y'' + y = \sin\omega x$, $y(0) = 0$, $y'(0) = 1$, $\omega = 1.09, 1.03, 1.015, 1$.


Case of Resonance. Consider, for example, the equation

(8.31)    $y'' + y = \sin x.$

Here, we cannot take $y(x) = a\sin x + b\cos x$, because this function is itself a
solution of the homogeneous equation. Inspired by the discussion on double roots
(see also Fig. 8.2), we try

(8.32)    $y(x) = ax\sin x + bx\cos x.$

The usual procedure (inserting the derivatives of (8.32) into (8.31)) yields

$2a\cos x - 2b\sin x = \sin x,$

so that $a = 0$ and $b = -1/2$. A particular solution of (8.31) is thus

(8.33)    $y(x) = -\frac{1}{2}\,x\cos x.$

It explodes for $x \to \infty$ (see Fig. 8.2).
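A quick numerical check of the resonant solution is sketched below. The central difference quotient used for $y''$ and the step `h` are assumptions of this sketch, so the residual is only small, not exactly zero.

```python
import math

def y(x):
    """Particular solution (8.33) of y'' + y = sin x."""
    return -0.5 * x * math.cos(x)

def residual(x, h=1e-3):
    """y'' + y - sin x, with y'' replaced by a central difference quotient."""
    second = (y(x + h) - 2.0 * y(x) + y(x - h)) / (h * h)
    return second + y(x) - math.sin(x)

for x in (1.0, 5.0, 10.0):
    print(x, residual(x))

# At x = 2*pi*k the solution equals -pi*k, so the amplitude grows linearly with x.
for k in (1, 5, 25):
    print(k, y(2.0 * math.pi * k))
```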


Method of Variation of Constants (Lagrange 1775, 1788). This is a general 
method that allows us to find a particular solution of (8.2) in the case where the 
general solution of the homogeneous equation (8.1) is known. In order to simplify 
the notation, we explain this method for the case n = 2. 

Consider the problem 


(8.34)    $y'' + a(x)\,y' + b(x)\,y = f(x)$

and assume that $y_1(x)$ and $y_2(x)$ are two known independent solutions of the
homogeneous equation $y'' + a(x)y' + b(x)y = 0$. The idea is to look for a solution
of the form

(8.35)    $y(x) = c_1(x)\,y_1(x) + c_2(x)\,y_2(x)$


(hence the name “variation of constants’’). The derivative of (8.35) is 
(8.36)    $y' = c_1' y_1 + c_2' y_2 + c_1 y_1' + c_2 y_2'.$

In order to avoid complications with higher order derivatives, we require that

(8.37)    $c_1' y_1 + c_2' y_2 = 0$

so that the derivative of (8.35) becomes $y' = c_1 y_1' + c_2 y_2'$. The second derivative
then becomes

(8.38)    $y'' = c_1' y_1' + c_2' y_2' + c_1 y_1'' + c_2 y_2''.$

If all these formulas are inserted into (8.34), the terms containing $c_1$ and $c_2$
disappear, because we have assumed that $y_1(x)$ and $y_2(x)$ are solutions of the
homogeneous equation. All that remains is

(8.39)    $c_1' y_1' + c_2' y_2' = f(x).$

This, together with (8.37), constitutes the linear system

(8.40)    $\underbrace{\begin{pmatrix} y_1(x) & y_2(x) \\ y_1'(x) & y_2'(x) \end{pmatrix}}_{W(x)} \underbrace{\begin{pmatrix} c_1'(x) \\ c_2'(x) \end{pmatrix}}_{c'(x)} = \underbrace{\begin{pmatrix} 0 \\ f(x) \end{pmatrix}}_{F(x)}.$

The matrix $W(x)$ is called the Wronskian. Computing $c'(x)$ from (8.40) and
integrating yields

$c(x) = \int_{x_0}^{x} W^{-1}(t)F(t)\,dt,$

and a solution of (8.34) is given by

(8.41)    $y(x) = \bigl(y_1(x), y_2(x)\bigr)\begin{pmatrix} c_1(x) \\ c_2(x) \end{pmatrix} = \int_{x_0}^{x} \bigl(y_1(x), y_2(x)\bigr)\,W^{-1}(t)F(t)\,dt.$
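Example. Consider the equation with constant coefficients

(8.42)    $y'' + 2a\,y' + b\,y = f(x),$

where $a^2 < b$. The homogeneous equation possesses the solutions $y_1(x) =
e^{(\alpha + i\beta)x}$, $y_2(x) = e^{(\alpha - i\beta)x}$, where $\alpha = -a$ and $\beta = \sqrt{b - a^2}$. The Wronskian
and its inverse are

$W(x) = e^{\alpha x}\begin{pmatrix} e^{i\beta x} & e^{-i\beta x} \\ (\alpha + i\beta)e^{i\beta x} & (\alpha - i\beta)e^{-i\beta x} \end{pmatrix},$

$W^{-1}(x) = \frac{e^{-\alpha x}}{2i\beta}\begin{pmatrix} (-\alpha + i\beta)e^{-i\beta x} & e^{-i\beta x} \\ (\alpha + i\beta)e^{i\beta x} & -e^{i\beta x} \end{pmatrix}.$

Consequently, we find from (8.41) that

(8.43)    $y(x) = \frac{1}{\beta}\int_0^x \Bigl(e^{\alpha(x-t)}\,\frac{e^{i\beta(x-t)} - e^{-i\beta(x-t)}}{2i}\Bigr) f(t)\,dt = \frac{1}{\beta}\int_0^x \bigl(e^{\alpha(x-t)}\sin\beta(x-t)\bigr) f(t)\,dt.$

This formula is valid for any function $f(t)$.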
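Formula (8.43) lends itself to direct numerical evaluation. The following is a minimal sketch, assuming a midpoint rule for the integral; the function name `particular_solution` and the test case $y'' + y = 1$ (that is, $a = 0$, $b = 1$, $f \equiv 1$, for which (8.43) gives $1 - \cos x$) are chosen here for illustration.

```python
import math

def particular_solution(a, b, f, x, n=2000):
    """Evaluate (8.43) for y'' + 2a y' + b y = f(x) with a*a < b,
    using the composite midpoint rule with n subintervals on [0, x]."""
    alpha = -a
    beta = math.sqrt(b - a * a)
    h = x / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += math.exp(alpha * (x - t)) * math.sin(beta * (x - t)) * f(t)
    return h * total / beta

# Test case (an assumption): y'' + y = 1, where (8.43) reduces to 1 - cos x.
print(particular_solution(0.0, 1.0, lambda t: 1.0, 2.0), 1.0 - math.cos(2.0))
```

The solution produced by (8.43) satisfies $y(0) = 0$ and $y'(0) = 0$, because the kernel $e^{\alpha(x-t)}\sin\beta(x-t)$ vanishes for $t = x$.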
Cauchy's Equation


An equation of the form

(8.44)    $y^{(n)} + \frac{a_{n-1}}{x}\,y^{(n-1)} + \ldots + \frac{a_1}{x^{n-1}}\,y' + \frac{a_0}{x^n}\,y = 0$

is usually called "Cauchy's equation". Its analytic solution was discussed in full
detail by Euler (1769, "Sectio Secunda, Caput V"). Instead of $e^{\lambda x}$, one looks for
solutions of the form

(8.45)    $y(x) = x^r.$

Example. Consider the problem

(8.46)    $y'' + \frac{1}{x}\,y' - \frac{1}{x^2}\,y = 0.$

Inserting (8.45) yields

$\bigl(r(r - 1) + r - 1\bigr)\,x^{r-2} = 0.$

The roots of this equation are $r = 1$ and $r = -1$. Hence, the general solution of
(8.46) is

(8.47)    $y(x) = c_1 x + \frac{c_2}{x}.$

Another possibility for solving (8.44) is the use of the transformation

(8.48)    $x = e^t$,   $y(x) = z(t).$

Since

(8.49)    $\frac{dz}{dt} = \frac{dy}{dx}\cdot\frac{dx}{dt} = x\,y'$,   $\frac{d^2 z}{dt^2} = \ldots = x\,y' + x^2 y'',$

Eq. (8.46) becomes an equation with constant coefficients $z'' - z = 0$, to which
we can apply the above theory (Theorem 8.3). This gives $z(t) = c_1 e^t + c_2 e^{-t}$,
which, after back substitution, becomes (8.47) again.


Exercises

8.1 If $\lambda_1, \ldots, \lambda_n$ are distinct complex numbers, then

    (8.50)    $c_1 e^{\lambda_1 x} + c_2 e^{\lambda_2 x} + \ldots + c_n e^{\lambda_n x} = 0$

    for all $x$ if and only if $c_1 = c_2 = \ldots = c_n = 0$.
    Hint. Differentiating Eq. (8.50) at $x = 0$ shows that $\sum_{i=1}^{n} c_i\lambda_i^k = 0$ for
    $k = 0, 1, \ldots$. Consider then the expression $\sum_{i=1}^{n} c_i\,p(\lambda_i)$, where $p(x)$ is a
    polynomial that vanishes for $\lambda_1, \ldots, \lambda_{j-1}, \lambda_{j+1}, \ldots, \lambda_n$ but not for $\lambda_j$.

8.2 For distinct values $\lambda_1, \ldots, \lambda_n$ we have

    $\sum_{i=1}^{n} (c_i + d_i x + e_i x^2)\,e^{\lambda_i x} = 0$

    for all $x$ if and only if all coefficients $c_i$, $d_i$, $e_i$ vanish.
    Hint. Prove that for an arbitrary polynomial $p$ we have

    $\sum_{i=1}^{n} \bigl(c_i\,p(\lambda_i) + d_i\,p'(\lambda_i) + e_i\,p''(\lambda_i)\bigr) = 0.$

8.3 A second access to the case of multiple characteristic values (d'Alembert
    1748). Suppose that $\lambda$ is a double root of (8.10). Split this root into two
    neighboring roots $\lambda$ and $\lambda + \varepsilon$ (with $\varepsilon$ infinitely small). In this case, $e^{\lambda x}$, $e^{(\lambda+\varepsilon)x}$,
    and also the linear combination

    $y(x) = \frac{e^{(\lambda+\varepsilon)x} - e^{\lambda x}}{\varepsilon}$

    are solutions of the problem. Show that the latter becomes, for $\varepsilon \to 0$, the
    solution $xe^{\lambda x}$.

8.4 Look for a particular solution of $y'' + 0.2y' + y = \sin(\omega x)$ and study its
    amplitude as function of $\omega$. What phenomenon can be observed?

8.5 Compute a particular solution of $y'' - 2y' + y = e^x\cos x$
    a) by putting $y = Ae^x\sin x + Be^x\cos x$;
    b) by the method of variation of constants; and
    c) by solving $y'' - 2y' + y = e^{(1+i)x}$.

8.6 Solve the following homogeneous and inhomogeneous Cauchy equations:

    $x^2 y'' - xy' - 3y = 0,$
    xy" . xy! = By = x’,
    $x^2 y'' - 3xy' + 4y = 0.$

    The last equation will lead to a problem of double roots. Meet the situation
    with determination (Laurel & Hardy 1933, The Sons of the Desert).

8.7 Let $y_1(x)$ and $y_2(x)$ be two solutions of $y'' + a(x)y' + b(x)y = 0$. Then,
    show that the Wronskian (8.40) satisfies

    $\det\bigl(W(x)\bigr) = \det\bigl(W(x_0)\bigr)\cdot\exp\Bigl(-\int_{x_0}^{x} a(t)\,dt\Bigr).$

    Hint. Find a differential equation for $z(x) = \det(W(x))$.
II.9 Numerical Solution of Differential Equations


I have always observed that graduate mathematicians and physicists are 
very well acquainted with theoretical results, but have no knowledge of the 
simplest approximate methods. 

(L. Collatz, Num. Beh. Diffgl., Springer 1951, Engl. transl. 1960) 


It is often impossible to solve a differential equation

(9.1)    $y' = f(x, y)$

by analytic methods (e.g., $y' = x^2 + y^2$). If it is possible, it may happen that the
integrals that appear are not elementary (e.g., $y'' + \sin y = 0$, see (7.23)). Even
in the case where all integrals are elementary, the formulas obtained might not be
useful. For example, the solution of $y' = y^4 + 1$ is given by (see Eq. (5.16))

$\frac{\sqrt{2}}{8}\ln\frac{y^2 + \sqrt{2}\,y + 1}{y^2 - \sqrt{2}\,y + 1} + \frac{\sqrt{2}}{4}\Bigl(\arctan\bigl(y\sqrt{2} + 1\bigr) + \arctan\bigl(y\sqrt{2} - 1\bigr)\Bigr) = x + C,$

which is a rather impractical formula, especially if we want $y$ as a function of $x$.
Therefore, it is interesting to search for numerical methods that treat (9.1) directly.


Euler’s Method 
PROBLEM 85: Given an arbitrary differential equation, find for its integral
a close approximation. (Euler 1768, §650)


Equation (9.1) prescribes for each point $(x, y)$ a value $f(x, y)$ that is the slope of
the solution. One can thus imagine a field of directions (Joh. Bernoulli 1694). The
curves that always follow these directions are the solutions of (9.1). See Fig. 9.1
for the "Exemplo res patebit" (called Riccati's equation)

(9.2)    $y' = x^2 + y^2,$

which does not possess an elementary solution (Liouville 1841, "J'ai donc pensé
qu'il pouvait être bon de soumettre la question à une analyse exacte ..."). Obviously,
the solutions are not unique. Therefore, we prescribe an initial value

(9.3)    $y(x_0) = y_0.$


Euler's Idea (Euler 1768, Sectio Secunda, Caput VII). We choose $h > 0$ and we
replace the solution for $x_0 \le x \le x_0 + h$ by its tangent line

$\ell(x) = y_0 + (x - x_0)\cdot f(x_0, y_0).$

For the point $x_1 = x_0 + h$ this gives $y_1 = y_0 + hf(x_0, y_0)$. At this point we
compute again the new direction and repeat the above procedure in order to obtain
the "valores successivi"

(9.4)    $x_{n+1} = x_n + h$,   $y_{n+1} = y_n + hf(x_n, y_n).$
FIGURE 9.1. Prescribed slopes for $y' = x^2 + y^2$ with four solutions


This is Euler’s method. The function that is obtained by connecting all these tan- 
gents is called Euler’s polygon. If we let h — 0, these polygons approach the 
solution more and more closely (see Fig. 9.2). 


Numerical Experiment. We consider the differential equation (9.2), choose the
initial values $x_0 = -1.5$, $y_0 = -1.4$, and the step sizes $h = 1/4, 1/8, 1/16, 1/32$.
The resulting Euler polygons are plotted in Fig. 9.2. The numerical approximation
and the errors at $x = 0$ are shown in Table 9.1. We observe that the error decreases
by a factor of 2 whenever the step size is halved ("quot" denotes the quotient
between the errors for two successive step sizes). An explanation of this fact can
be found in any textbook on numerical analysis (e.g., Hairer, Nørsett, & Wanner
1993, Sect. II.3, p. 159).
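The experiment can be repeated with a few lines of code. This is a sketch under stated assumptions: the function names are chosen here, and instead of an exact reference value the loop prints the quotient of successive differences between the approximations, which can be compared with the factor of 2 described above.

```python
def f(x, y):
    """Right-hand side of Riccati's equation (9.2): y' = x^2 + y^2."""
    return x * x + y * y

def euler(f, x0, y0, x_end, steps):
    """Euler's method (9.4) with the constant step size h = (x_end - x0) / steps."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        y += h * f(x, y)   # uses the slope at the old point (x_n, y_n)
        x += h
    return y

values = []
for inv_h in (4, 8, 16, 32, 64, 128):
    steps = (3 * inv_h) // 2            # interval length 1.5 divided by h = 1/inv_h
    values.append(euler(f, -1.5, -1.4, 0.0, steps))
    print(inv_h, values[-1])

# Quotients of successive differences of the approximations at x = 0.
for i in range(len(values) - 2):
    print((values[i] - values[i + 1]) / (values[i + 1] - values[i + 2]))
```

For $1/h = 4$ and $1/h = 8$ the printed approximations can be compared with the first two rows of Table 9.1.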


TABLE 9.1. Euler's method

  1/h      y(0)         error       quot
    4    0.7246051   -0.6762019
    8    0.2968225   -0.2484192    2.722
   16    0.1577289   -0.1093256    2.272
   32    0.0999576   -0.0515543    2.121
   64    0.0734660   -0.0250628    2.057
  128    0.0607632   -0.0123599    2.028
  256    0.0545412   -0.0061380    2.014
  512    0.0514618   -0.0030586    2.007

TABLE 9.2. Method (9.5)

  1/h      y(0)         error       quot
    2   -0.7330279    0.7814312
    4   -0.1063739    0.1547771    5.049
    8    0.0153874    0.0330159    4.688
   16    0.0409854    0.0074179    4.451
   32    0.0466509    0.0017523    4.233
   64    0.0479776    0.0004257    4.116
  128    0.0482984    0.0001049    4.058
  256    0.0483772    0.0000260    4.029
FIGURE 9.2. Polygons for $y' = x^2 + y^2$
FIGURE 9.3. Parabolas of order 2


Taylor Series Method 


PROBLEM 86: Improve significantly the above method of approximate 

integration of differential equations, so that the result be closer to the truth. 

(Euler 1768, §656) 

We note that (9.4) represents the first two terms of Taylor's series. In order to
improve the precision, let us use three terms so that

(9.5)    $y_{n+1} = y_n + h\,y_n' + \frac{h^2}{2}\,y_n''.$

We have $y_n' = f(x_n, y_n)$, and for the computation of $y_n''$ we simply differentiate
the differential equation with respect to $x$. This gives, for $y' = x^2 + y^2$,

(9.6)    $y'' = 2x + 2yy' = 2x + 2x^2 y + 2y^3.$

The numerical results obtained by (9.5) with $h = 1/2, 1/4, 1/8$, and $1/16$ are
shown in Fig. 9.3. We have replaced the polygons of Euler's method by
"polyparabolas" composed of the truncated Taylor series. The errors at $x = 0$ are
presented in Table 9.2. For small $h$ the results are much better than for Euler's
method; halving the step size divides the error by 4.
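A minimal sketch of method (9.5) for Riccati's equation follows; the function name `taylor2` is an assumption, and the second derivative is taken from (9.6).

```python
def taylor2(x0, y0, x_end, steps):
    """Three-term Taylor method (9.5) for y' = x^2 + y^2, with y'' from (9.6)."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        d1 = x * x + y * y            # y'  = x^2 + y^2
        d2 = 2.0 * x + 2.0 * y * d1   # y'' = 2x + 2 y y'
        y += h * d1 + 0.5 * h * h * d2
        x += h
    return y

values = []
for inv_h in (2, 4, 8, 16, 32, 64):
    steps = (3 * inv_h) // 2          # interval length 1.5 divided by h = 1/inv_h
    values.append(taylor2(-1.5, -1.4, 0.0, steps))
    print(inv_h, values[-1])

# Quotients of successive differences of the approximations at x = 0.
for i in range(len(values) - 2):
    print((values[i] - values[i + 1]) / (values[i + 1] - values[i + 2]))
```

For $1/h = 2$ and $1/h = 4$ the printed approximations can be compared with the first two rows of Table 9.2.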


Remark. It is of course possible to take additional terms of the Taylor series into
account, e.g.,

(9.7)    $y_{n+1} = y_n + h\,y_n' + \frac{h^2}{2!}\,y_n'' + \frac{h^3}{3!}\,y_n'''.$

The higher derivatives are obtained by iterated differentiation of the differential
equation. For Riccati's equation we obtain from (9.6)

$y''' = 2 + 2y'y' + 2yy'' = 2 + 4xy + 2x^4 + 8x^2y^2 + 6y^4,$
$y'''' = 4y + 12x^3 + 20xy^2 + 16x^4y + 40x^2y^3 + 24y^5$, etc.

FIGURE 9.5. Solutions for the pendulum (9.8')
FIGURE 9.6. Numerical solutions for the pendulum (9.8')


Second-Order Equations 
Consider, for example, the pendulum equation (7.23)

(9.8)    $y'' = -\sin y.$

We introduce a new variable $v$ for $y'$ so that (9.8) becomes

(9.8')    $y' = v$,   $v' = -\sin y.$

This system can be interpreted as a vector field, which prescribes at each point
$(y, v)$ a velocity of the point $(y(x), v(x))$ moving with $x$ (Fig. 9.4). The
solutions $(y(x), v(x))$ constantly respect the prescribed velocity. They are sketched in
Fig. 9.5. The ovals represent the oscillations; the sinusoids are the rotations of a
pendulum that turns over.


Euler's Method. The idea (Cauchy 1824) is to apply Euler's method (9.4) to both
functions $y(x)$ and $v(x)$. If $y(x_0) = y_0$ and $v(x_0) = v_0$ are given initial values
and $h > 0$ is a chosen step size, the analog of (9.4) applied to (9.8') is

(9.9)    $x_{n+1} = x_n + h$,   $y_{n+1} = y_n + h\cdot v_n$,   $v_{n+1} = v_n - h\cdot\sin(y_n).$

Fig. 9.6 shows Euler's polygons for the initial values $y(0) = 1.2$, $v(0) = 0$, and
for $h = 0.15$. We observe that our tremendous method predicts that the pendulum,
in contrast to physical reality, accelerates and finally turns over.
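The spurious acceleration can be followed through the quantity $v^2/2 - \cos y$, which is constant along exact solutions (compare $p^2/2 = \cos y + C$ in the example following (7.23)). The sketch below applies (9.9) with the data of Fig. 9.6; the function names and the choice of this quantity as a diagnostic are assumptions made for illustration.

```python
import math

def euler_pendulum(y0, v0, h, steps):
    """Apply (9.9) to y' = v, v' = -sin y and return the list of (y, v) pairs."""
    y, v = y0, v0
    path = [(y, v)]
    for _ in range(steps):
        # Both updates use the old values (y_n, v_n), as in (9.9).
        y, v = y + h * v, v - h * math.sin(y)
        path.append((y, v))
    return path

def energy(y, v):
    """v^2/2 - cos y, which is constant along exact solutions of (9.8)."""
    return 0.5 * v * v - math.cos(y)

path = euler_pendulum(1.2, 0.0, 0.15, 40)
for n in range(0, 41, 10):
    y, v = path[n]
    print(n, y, v, energy(y, v))
```

A growing value in the last column means that the numerical pendulum gains energy from step to step, which is the behaviour described above.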
Taylor Series Method. Differentiating (9.8’) with respect to x, we obtain 

i 


(9.10) y” =v' =—siny, v” =—cosy-y! = —cosy-v, 


which allow us to use an additional term of the Taylor series. The analog of
Eq. (9.5) becomes


(9.11)    $y_{n+1} = y_n + h\,y_n' + \frac{h^2}{2}\,y_n'' = y_n + h\,v_n - \frac{h^2}{2}\,\sin y_n,$
          $v_{n+1} = v_n + h\,v_n' + \frac{h^2}{2}\,v_n'' = v_n - h\,\sin(y_n) - \frac{h^2}{2}\,\cos y_n \cdot v_n.$


The results (see Fig. 9.6 to the right) are much better even for h twice as large. 
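
The comparison between (9.9) and (9.11) can be tried out with a short program. The following Python sketch is an illustration only: the function names, the number of steps, and the use of the energy $v^2/2 - \cos y$ (which is constant along exact solutions of (9.8’)) as a quality measure are assumptions made here, not part of the text.

```python
import math

def euler_step(y, v, h):
    """One step of Euler's method (9.9) for y' = v, v' = -sin y."""
    return y + h * v, v - h * math.sin(y)

def taylor2_step(y, v, h):
    """One step of the second-order Taylor series method (9.11)."""
    s, c = math.sin(y), math.cos(y)
    return (y + h * v - 0.5 * h * h * s,
            v - h * s - 0.5 * h * h * c * v)

def energy(y, v):
    """Pendulum energy v^2/2 - cos y; constant for the exact solution."""
    return 0.5 * v * v - math.cos(y)

def integrate(step, y, v, h, n_steps):
    """Apply `step` repeatedly, starting from (y, v)."""
    for _ in range(n_steps):
        y, v = step(y, v, h)
    return y, v

# Initial values of Fig. 9.6; both runs cover the same interval 0 <= x <= 30
# (an interval length chosen here for illustration).
y0, v0 = 1.2, 0.0
e0 = energy(y0, v0)
drift_euler = energy(*integrate(euler_step, y0, v0, 0.15, 200)) - e0
drift_taylor = energy(*integrate(taylor2_step, y0, v0, 0.30, 100)) - e0
print("energy drift, Euler  (h = 0.15):", drift_euler)
print("energy drift, Taylor (h = 0.30):", drift_taylor)
```

A growing energy is the numerical counterpart of the pendulum that "accelerates and finally turns over"; the drift of the Taylor method should come out much smaller, even with the doubled step size.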
Exercises


9.1 Apply the method of Euler with $h = 1/N$ to the equation

$y' = \lambda y, \qquad y(0) = 1$

in order to obtain an approximation of $y(1) = e^\lambda$. The result is a well-known
formula of Chap. I.


9.2 (Inverse Error Function). Define a function $y(x)$ by the relation

$x = \frac{2}{\sqrt{\pi}} \int_0^y e^{-t^2}\,dt.$

Differentiate this formula and show that $y(x)$ satisfies the differential equation

$y' = \frac{\sqrt{\pi}}{2}\, e^{y^2}, \qquad y(0) = 0.$

Compute the first four terms of the Taylor series for $y(x)$ (developed at the
point $x = 0$).


9.3 (Van der Pol’s Equation). Compute $y^{(i)}$ and $v^{(i)}$ for $i = 1, 2, 3$ for the
solutions of the differential equation

$y' = v, \qquad v' = \varepsilon(1 - y^2)v - y,$

and compute numerically the solution using the third-order Taylor series method
for $\varepsilon = 0.3$, the initial values $y(0) = 2.00092238555422$, $v(0) = 0$, and for
$0 \le x \le 6.31844320345412$. The correct solution is periodic for this interval and the
given initial values.
II.10 The Euler-Maclaurin Summation Formula


The King calls me “my Professor”, and I am the happiest man in the world!
(Euler is proud to serve Frederick II in Berlin)


I have here a geometer who is a big cyclops ... who has only one eye left, 
and a new curve, which he is presently computing, could render him totally 
blind. (Frederick II; see Spiess 1929, p. 165-166.) 


This formula was developed independently by Euler (1736) and Maclaurin (1742)
as a powerful tool for the computation of sums such as the harmonic sum
$1 + \frac12 + \frac13 + \ldots + \frac1n$, the sum of logarithms $\ln 2 + \ln 3 + \ln 4 + \ldots + \ln n = \ln n!$,
the sum of powers $1^k + 2^k + 3^k + \ldots + n^k$, or the sum of reciprocal powers
$1 + \frac{1}{2^k} + \frac{1}{3^k} + \ldots + \frac{1}{n^k}$ with the help of differential calculus.


Problem. For a given function f(x), find a formula for 


(10.1)    $S = f(1) + f(2) + f(3) + \ldots + f(n) = \sum_{i=1}^{n} f(i)$
(“investigatio summae serierum ex termino generali’). 


Euler’s Derivation of the Formula 


The first idea (see Euler 1755, pars posterior, § 105, Maclaurin 1742, Book II, 
Chap. IV, p. 663f) is to consider also the sum with shifted arguments 


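(10.2)    $s = f(0) + f(1) + f(2) + \ldots + f(n-1).$
We compute the difference $S - s$ using Taylor’s series (Eq. (2.8) with $x - x_0 = -1$)


$f(i-1) - f(i) = -\frac{f'(i)}{1!} + \frac{f''(i)}{2!} - \frac{f'''(i)}{3!} + \ldots$


and find
$f(n) - f(0) = \sum_{i=1}^{n} f'(i) - \frac{1}{2!}\sum_{i=1}^{n} f''(i) + \frac{1}{3!}\sum_{i=1}^{n} f'''(i) - \frac{1}{4!}\sum_{i=1}^{n} f''''(i) + \ldots$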


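In order to turn this formula for $\sum f'(i)$ into a formula for $\sum f(i)$, we replace $f$
by its primitive (again denoted by $f$):

(10.3)    $\sum_{i=1}^{n} f(i) = \int_0^n f(x)\,dx + \frac{1}{2!}\sum_{i=1}^{n} f'(i) - \frac{1}{3!}\sum_{i=1}^{n} f''(i) + \frac{1}{4!}\sum_{i=1}^{n} f'''(i) - \ldots$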


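The second idea is to remove the sums $\sum f'$, $\sum f''$, $\sum f'''$, ... on the right by using
the same formula, with $f$ successively replaced by $f'$, $f''$, $f'''$, etc. This will lead
to a formula of the type

(10.4)    $\sum_{i=1}^{n} f(i) = \int_0^n f(x)\,dx - \alpha\big(f(n) - f(0)\big) + \beta\big(f'(n) - f'(0)\big) - \gamma\big(f''(n) - f''(0)\big) + \delta\big(f'''(n) - f'''(0)\big) - \ldots$


For the computation of the coefficients $\alpha, \beta, \gamma, \ldots$ we successively replace $f$ in
(10.4) by $f'$, $f''$, ... to obtain


$\sum f(i) = \int_0^n f(x)\,dx - \alpha\big(f(n) - f(0)\big) + \beta\big(f'(n) - f'(0)\big) - \ldots$
$-\frac{1}{2!}\sum f'(i) = -\frac{1}{2!}\big(f(n) - f(0)\big) + \frac{\alpha}{2!}\big(f'(n) - f'(0)\big) - \ldots$
$+\frac{1}{3!}\sum f''(i) = +\frac{1}{3!}\big(f'(n) - f'(0)\big) - \ldots$


The sum of all this, by (10.3), has to be $\int_0^n f(x)\,dx$. Therefore, we obtain


(10.5)    $\alpha + \frac{1}{2!} = 0, \qquad \beta + \frac{\alpha}{2!} + \frac{1}{3!} = 0, \qquad \gamma + \frac{\beta}{2!} + \frac{\alpha}{3!} + \frac{1}{4!} = 0, \quad \ldots$


from which we can compute $\alpha = -\frac12$, $\beta = \frac{1}{12}$, $\gamma = 0$, $\delta = -\frac{1}{720}$, ... and we
have


(10.6)    $\sum_{i=1}^{n} f(i) = \int_0^n f(x)\,dx + \frac12\big(f(n) - f(0)\big) + \frac{1}{12}\big(f'(n) - f'(0)\big) - \frac{1}{720}\big(f'''(n) - f'''(0)\big) + \ldots$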


(10.1) Example. This formula, applied to a sum of nearly a million terms,


$\frac{1}{11} + \frac{1}{12} + \ldots + \frac{1}{1000000} = \ln(10^6) - \ln(10) + \frac12\cdot 10^{-6} - \frac12\cdot 10^{-1} - \frac{1}{12}\cdot 10^{-12} + \frac{1}{12}\cdot 10^{-2} + \frac{1}{120}\cdot 10^{-24} - \frac{1}{120}\cdot 10^{-4} + \ldots \approx 11.463758469,$


gives an excellent approximation of the exact result by a couple of terms only. The
formula is, however, of no use for the computation of the first terms $1 + \frac12 + \ldots + \frac{1}{10}$.
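
Example 10.1 is easy to check by brute force. The Python sketch below applies formula (10.6), shifted to the interval from 10 to $10^6$, to $f(x) = 1/x$ and compares it with the directly computed sum. The function name and the decision to include the term with coefficient $1/252$ (the next term of the series, which follows from $f^{(5)}(x) = -120/x^6$) are choices made here for illustration.

```python
import math

def harmonic_tail_euler_maclaurin(a, b):
    """Approximate 1/(a+1) + ... + 1/b by the Euler-Maclaurin series for f(x) = 1/x."""
    return (math.log(b) - math.log(a)
            + (1 / 2) * (1 / b - 1 / a)
            - (1 / 12) * (1 / b**2 - 1 / a**2)
            + (1 / 120) * (1 / b**4 - 1 / a**4)
            - (1 / 252) * (1 / b**6 - 1 / a**6))

# Direct summation of the nearly one million terms 1/11 + ... + 1/1000000.
direct = math.fsum(1.0 / i for i in range(11, 10**6 + 1))
approx = harmonic_tail_euler_maclaurin(10, 10**6)
print("direct sum      :", direct)
print("Euler-Maclaurin :", approx)
print("difference      :", direct - approx)
```

A handful of terms of the series should reproduce the direct sum to roughly ten digits.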


Bernoulli Numbers. It is customary to replace the coefficients $\alpha, \beta, \gamma, \ldots$ by
$B_i/i!$ ($B_0 = 1$, $\alpha = B_1/1!$, $\beta = B_2/2!$, ...), so that (10.5) becomes


(10.5’)    $2B_1 + B_0 = 0, \qquad 3B_2 + 3B_1 + B_0 = 0, \qquad \ldots, \qquad \sum_{i=0}^{k-1} \binom{k}{i} B_i = 0.$


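The Bernoulli numbers, as far as Euler calculated them, are


$B_0 = 1, \quad B_1 = -\frac12, \quad B_2 = \frac16, \quad B_4 = -\frac{1}{30}, \quad B_6 = \frac{1}{42}, \quad B_8 = -\frac{1}{30},$
$B_{10} = \frac{5}{66}, \quad B_{12} = -\frac{691}{2730}, \quad B_{14} = \frac76, \quad B_{16} = -\frac{3617}{510}, \quad B_{18} = \frac{43867}{798},$
$B_{20} = -\frac{174611}{330}, \quad B_{22} = \frac{854513}{138}, \quad B_{24} = -\frac{236364091}{2730},$
$B_{26} = \frac{8553103}{6}, \quad B_{28} = -\frac{23749461029}{870}, \quad B_{30} = \frac{8615841276005}{14322},$

and $B_3 = B_5 = \ldots = 0$. In this notation, Eq. (10.6) becomes

(10.6’)    $\sum_{i=1}^{n} f(i) = \int_0^n f(x)\,dx + \frac12\big(f(n) - f(0)\big) + \sum_{k \ge 1} \frac{B_{2k}}{(2k)!}\big(f^{(2k-1)}(n) - f^{(2k-1)}(0)\big).$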


Example. For $f(x) = x^q$ the series of Eq. (10.6’) is finite and gives the well-known
formula of Jac. Bernoulli (I.1.28), (I.1.29).


Generating Function. In order to get more insight into the Bernoulli numbers, 
we apply one of Euler’s great ideas: consider the function V(u) whose Taylor 
coefficients are the numbers under consideration, i.e., define 


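(10.7)    $V(u) = 1 + \alpha u + \beta u^2 + \gamma u^3 + \delta u^4 + \ldots = 1 + \frac{B_1}{1!}u + \frac{B_2}{2!}u^2 + \frac{B_3}{3!}u^3 + \frac{B_4}{4!}u^4 + \ldots$


Now the formulas (10.5) alias (10.5’) say simply that


$V(u) \cdot \left(u + \frac{u^2}{2!} + \frac{u^3}{3!} + \frac{u^4}{4!} + \ldots\right) = u,$
that is,
(10.8)    $V(u) = \frac{u}{e^u - 1}.$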


Thus, the infinitely many algebraic equations become one analytic formula. The
fact that

(10.9)    $V(u) + \frac{u}{2} = \frac{u}{e^u - 1} + \frac{u}{2} = \frac{u}{2}\cdot\frac{e^u + 1}{e^u - 1} = \frac{u}{2}\cdot\frac{e^{u/2} + e^{-u/2}}{e^{u/2} - e^{-u/2}}$

is an even function shows that $B_3 = B_5 = B_7 = \ldots = 0$.
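
The recursion (10.5’) determines the Bernoulli numbers one after the other, since the $k$th relation contains $B_{k-1}$ with the coefficient $\binom{k}{k-1} = k$. A minimal Python sketch with exact rational arithmetic follows; the function name `bernoulli_numbers` is chosen here for illustration.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_n_max] from sum_{i=0}^{k-1} C(k, i) B_i = 0, B_0 = 1."""
    B = [Fraction(1)]
    for k in range(2, n_max + 2):
        # Known part of the k-th relation; the unknown B_{k-1} has coefficient k.
        partial = sum(comb(k, i) * B[i] for i in range(k - 1))
        B.append(-partial / k)
    return B

B = bernoulli_numbers(30)
for k in (1, 2, 4, 6, 8, 10, 12):
    print(f"B_{k} = {B[k]}")
```

The odd-indexed values beyond $B_1$ come out as zero, in agreement with the symmetry argument based on (10.9).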
De Usu Legitimo Formulae Summatoriae Maclaurinianae


We now insert $f(x) = \cos(2\pi x)$, for which $f(i) = 1$ for all $i$, into Eq. (10.6’).
This gives $1 + 1 + \ldots + 1$ to the left, and $0 + 0 + 0 + \ldots$ to the right, because
$\cos(2\pi x)$ together with all its derivatives is periodic with period 1. We see that the
formula as it stands is wrong! Another problem is that for most functions $f$ the
infinite series in (10.6’) does not converge.

It is therefore necessary to truncate the formula after a finite number of terms 
and to obtain an expression for the remainder. This was done in beautiful Latin 
(see above) by Jacobi (1834) by rearranging Euler’s proof using the error term 
(4.32) of Bernoulli-Cauchy throughout. It was later discovered (Wirtinger 1902) 
that the proof can be done simply by repeated integration by parts in a similar 
manner to the proof of Eq. (4.32). The main ingredient of the proof is the so-called 
Bernoulli polynomials. 


Bernoulli Polynomials. The polynomials 


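$B_1(x) = B_0 x + B_1 = x - \frac12$
$B_2(x) = B_0 x^2 + 2B_1 x + B_2 = x^2 - x + \frac16$
$B_3(x) = B_0 x^3 + 3B_1 x^2 + 3B_2 x + B_3 = x^3 - \frac32 x^2 + \frac12 x$
$B_4(x) = B_0 x^4 + 4B_1 x^3 + 6B_2 x^2 + 4B_3 x + B_4 = x^4 - 2x^3 + x^2 - \frac{1}{30}$


or, in general,


(10.10)    $B_k(x) = \sum_{i=0}^{k} \binom{k}{i} B_i\, x^{k-i},$
satisfy
(10.11)    $B_k'(x) = k\,B_{k-1}(x), \qquad B_k(0) = B_k(1) = B_k \quad (k \ge 2).$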


Indeed, the first formula of (10.11) is a property of the binomial coefficients (see 
Theorem I.2.1); the second formula follows from the definition and from (10.5’). 
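
Both properties in (10.11) can be confirmed for small $k$ by exact computation. The following Python sketch builds the coefficients of $B_k(x)$ from (10.10) and tests the two relations; all function names are hypothetical helpers introduced for this illustration.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(n_max):
    """[B_0, ..., B_n_max] from the recursion sum_{i<k} C(k, i) B_i = 0."""
    B = [Fraction(1)]
    for k in range(2, n_max + 2):
        partial = sum(comb(k, i) * B[i] for i in range(k - 1))
        B.append(-partial / k)
    return B

def bernoulli_poly(k, B):
    """Coefficients c[p] of x^p in B_k(x) = sum_i C(k, i) B_i x^(k-i)."""
    return [comb(k, k - p) * B[k - p] for p in range(k + 1)]

def derivative(coeffs):
    """Coefficients of the derivative of the polynomial with coefficients c[p]."""
    return [p * c for p, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Horner evaluation of the polynomial at x."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result

B = bernoulli_numbers(8)
for k in range(2, 9):
    poly = bernoulli_poly(k, B)
    # First relation of (10.11): B_k' = k B_{k-1}
    assert derivative(poly) == [k * c for c in bernoulli_poly(k - 1, B)]
    # Second relation of (10.11): B_k(0) = B_k(1) = B_k
    assert evaluate(poly, 0) == evaluate(poly, 1) == B[k]
print("B_3(x) coefficients (constant term first):", bernoulli_poly(3, B))
```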


(10.2) Theorem. We have


$\sum_{i=1}^{n} f(i) = \int_0^n f(x)\,dx + \frac12\big(f(n) - f(0)\big) + \sum_{j=2}^{k} \frac{B_j}{j!}\big(f^{(j-1)}(n) - f^{(j-1)}(0)\big) + \tilde R_k,$
where
(10.12)    $\tilde R_k = \frac{(-1)^{k-1}}{k!} \int_0^n \tilde B_k(x)\, f^{(k)}(x)\,dx.$


Here, $\tilde B_k(x)$ is equal to $B_k(x)$ for $0 \le x \le 1$ and extended periodically with
period 1 (see Fig. 10.1).

FIGURE 10.1. Bernoulli polynomials


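Proof. We start by proving the statement for $n = 1$. Using $B_1'(x) = 1$ and
integrating by parts we have


$\int_0^1 f(x)\,dx = \int_0^1 B_1'(x) f(x)\,dx = B_1(x) f(x)\Big|_0^1 - \int_0^1 B_1(x) f'(x)\,dx.$


The first term is $\frac12\big(f(1) + f(0)\big)$. In the second term we insert from (10.11)
$B_1(x) = \frac12 B_2'(x)$ and integrate once again. This gives


$\int_0^1 f(x)\,dx = \frac12\big(f(1) + f(0)\big) - \frac{B_2}{2!}\big(f'(1) - f'(0)\big) + \frac{1}{2!}\int_0^1 B_2(x) f''(x)\,dx$
or, continuing like this,
(10.13)    $\frac12\big(f(1) + f(0)\big) = \int_0^1 f(x)\,dx + \sum_{j=2}^{k} \frac{(-1)^j B_j}{j!}\big(f^{(j-1)}(1) - f^{(j-1)}(0)\big) + R_k,$
with

(10.14)    $R_k = \frac{(-1)^{k-1}}{k!} \int_0^1 B_k(x)\, f^{(k)}(x)\,dx.$


Since $B_j = 0$ for odd $j \ge 3$, the factor $(-1)^j$ may be dropped. We next apply
Eq. (10.13), with the remainder (10.14), to the shifted functions $f(x + i - 1)$, observe that


$\int_0^1 B_k(x)\, f^{(k)}(x + i - 1)\,dx = \int_{i-1}^{i} \tilde B_k(x)\, f^{(k)}(x)\,dx,$


and obtain the statement of Theorem 10.2 by summing these formulas from $i = 1$
to $i = n$.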
Estimating the Remainder. The estimates (for $0 \le x \le 1$)


$|B_1(x)| \le \frac12, \qquad |B_2(x)| \le \frac16, \qquad |B_3(x)| \le \frac{\sqrt3}{36}, \qquad |B_4(x)| \le \frac{1}{30},$
which are easy to check, and the fact that $\left|\int_0^n g(x)\,dx\right| \le \int_0^n |g(x)|\,dx$, show that


(10.15)    $|\tilde R_1| \le \frac12 \int_0^n |f'(x)|\,dx, \qquad |\tilde R_2| \le \frac{1}{12} \int_0^n |f''(x)|\,dx, \quad \ldots$
0 0 


These are the desired rigorous estimates of the remainder of Euler-Maclaurin’s 
summation formula. Further maximal and minimal values of the Bernoulli poly- 
nomials have been computed by Lehmer (1940); see Exercise 10.3. 


(10.3) Remark. If we apply the formula of Theorem 10.2 to the function $f(t) =
h\,g(a + th)$ with $h = (b - a)/n$ and if we pass the term $\big(f(n) - f(0)\big)/2$ to the
left side, we obtain (with $x_i = a + ih$)


(10.16)    $\frac{h}{2}\,g(x_0) + h\sum_{i=1}^{n-1} g(x_i) + \frac{h}{2}\,g(x_n) = \int_a^b g(x)\,dx + \sum_{j=2}^{k} \frac{h^j}{j!}\,B_j\big(g^{(j-1)}(b) - g^{(j-1)}(a)\big) + (-1)^{k-1}\frac{h^{k+1}}{k!} \int_0^n \tilde B_k(t)\, g^{(k)}(a + th)\,dt,$


where we recognize on the left the trapezoidal rule. Equation (10.16) shows that
the dominating term of the error is $(h^2/12)\big(g'(b) - g'(a)\big)$. However, if $g$ is
periodic, then all terms in the Euler-Maclaurin series disappear and the error is equal
to the remainder term for an arbitrary $k$; this explains the surprisingly good results of Table 6.2
(Sect. II.6).
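
Both statements of Remark 10.3 can be observed numerically. The Python sketch below uses two test integrands chosen here purely for illustration (they are not taken from Table 6.2): $g(x) = e^x$ on $[0, 1]$, where the error should be close to $(h^2/12)\big(g'(b) - g'(a)\big)$, and the periodic function $g(x) = 1/(2 + \cos x)$ on $[0, 2\pi]$, whose integral is $2\pi/\sqrt3$.

```python
import math

def trapezoid(g, a, b, n):
    """Trapezoidal rule: h/2 g(x_0) + h sum_{i=1}^{n-1} g(x_i) + h/2 g(x_n)."""
    h = (b - a) / n
    inner = math.fsum(g(a + i * h) for i in range(1, n))
    return h * (0.5 * g(a) + inner + 0.5 * g(b))

# Non-periodic integrand: the error follows the leading Euler-Maclaurin term.
n = 16
h = 1.0 / n
error_exp = trapezoid(math.exp, 0.0, 1.0, n) - (math.e - 1.0)
predicted = h * h / 12 * (math.exp(1.0) - math.exp(0.0))
print("error for exp      :", error_exp)
print("h^2/12 (g'(b)-g'(a)):", predicted)

# Periodic integrand over one full period: all correction terms cancel.
def periodic(x):
    return 1.0 / (2.0 + math.cos(x))

exact_periodic = 2.0 * math.pi / math.sqrt(3.0)
error_periodic = trapezoid(periodic, 0.0, 2.0 * math.pi, 32) - exact_periodic
print("error for periodic g:", error_periodic)
```

With comparable numbers of points, the error for the periodic integrand should drop to the level of rounding, while the error for $e^x$ stays near the predicted value.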


Stirling’s Formula 


We put $f(x) = \ln x$ in the Euler-Maclaurin formula. Since


$\sum_{i=2}^{n} f(i) = \ln 2 + \ln 3 + \ln 4 + \ln 5 + \ldots + \ln n = \ln(n!),$


we will obtain an approximate expression for the factorials $n! = 1 \cdot 2 \cdot \ldots \cdot n$.
(10.4) Theorem (Stirling 1730). We have


(10.17)    $n! = \frac{\sqrt{2\pi n}\; n^n}{e^n} \cdot \exp\left(\frac{1}{12n} - \frac{1}{360n^3} + \frac{1}{1260n^5} - \frac{1}{1680n^7} + \tilde R_9\right),$

where $|\tilde R_9| \le 0.0006605/n^8$. This gives, for $n \to \infty$, the approximation


(10.18)    $n! \approx \frac{\sqrt{2\pi n}\; n^n}{e^n}.$


Remark. This famous formula is especially useful in combinatorial analysis, statis- 
tics, and probability theory. Equation (10.17) is truncated after the 4th term simply 
because one additional term would not fit into the same line. 

The numerical values of (10.18) and (10.17) (with one, two and three terms) 
for n = 10 and n = 100 are compared to n! in Table 10.1. 


TABLE 10.1. Factorial function and approximations by Stirling’s formula


n = 10:    Stirling 0 = 0.359869561874103592162317593283 · 10^7
           Stirling 1 = 0.362881005142693352994116531675 · 10^7
           Stirling 2 = 0.362879997141301292538591223941 · 10^7
           Stirling 3 = 0.362880000021301281279077612862 · 10^7
           n!         = 0.362880000000000000000000000000 · 10^7

n = 100:   Stirling 0 = 0.932484762526934324776475612718 · 10^158
           Stirling 1 = 0.933262157031762340989619195146 · 10^158
           Stirling 2 = 0.933262154439867463946383356624 · 10^158
           Stirling 3 = 0.933262154439441582371338864918 · 10^158
           n!         = 0.933262154439441526816992388563 · 10^158
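
The leading digits of Table 10.1 can be reproduced in ordinary double precision. In the Python sketch below, "Stirling 0" is (10.18) and "Stirling $j$" keeps the first $j$ terms of the exponent in (10.17); the function name `stirling` and its `terms` argument are assumptions of this illustration. Floating-point numbers carry only about 16 digits, so the table's 30 digits are out of reach this way.

```python
import math

# Coefficients of 1/n, 1/n^3, 1/n^5, 1/n^7 in the exponent of (10.17).
COEFFS = (1 / 12, -1 / 360, 1 / 1260, -1 / 1680)

def stirling(n, terms):
    """Approximation of n! by (10.17), truncated after `terms` terms of the exponent."""
    exponent = sum(c / n ** (2 * j + 1) for j, c in enumerate(COEFFS[:terms]))
    return math.sqrt(2 * math.pi * n) * n**n / math.exp(n) * math.exp(exponent)

for n in (10, 100):
    for terms in range(4):
        print(f"n = {n:3d}  Stirling {terms} = {stirling(n, terms):.15e}")
    print(f"n = {n:3d}  n!         = {float(math.factorial(n)):.15e}")
```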


Proof. We have seen above (Example 10.1) that the Euler-Maclaurin formula is
inefficient if the higher derivatives of $f(x)$ become large on the considered
interval. We therefore apply the formula with $f(x) = \ln x$ for the sum from $i = n + 1$
to $i = m$. Since


$\int \ln x\,dx = x\ln x - x, \qquad (\ln x)^{(j)} = (-1)^{j-1}\,\frac{(j-1)!}{x^j},$


we obtain from Theorem 10.2 that

(10.19)    $\sum_{i=n+1}^{m} f(i) = \ln m! - \ln n! = m\ln m - m - (n\ln n - n) + \frac12(\ln m - \ln n) + \frac{1}{12}\left(\frac1m - \frac1n\right) - \frac{1}{360}\left(\frac{1}{m^3} - \frac{1}{n^3}\right) + \tilde R_5,$

where $|\tilde R_5| \le 0.00123/n^4$ for all $m > n$. This estimate is obtained from (10.12)
and (10.15) and the fact that $|B_5(x)| \le 0.02446$ for $0 \le x \le 1$. In (10.19),
the terms $\ln n!$, $n\ln n$, $n$, and $(1/2)\ln n$ diverge individually for $n \to \infty$. We
therefore take them together and set

(10.20)    $\gamma_n = \ln n! + n - \left(n + \frac12\right)\ln n,$
and (10.19) becomes


(10.21)    $\gamma_m = \gamma_n + \frac{1}{12}\left(\frac1m - \frac1n\right) - \frac{1}{360}\left(\frac{1}{m^3} - \frac{1}{n^3}\right) + \tilde R_5.$
For $n$ and $m$ sufficiently large $\gamma_n$ and $\gamma_m$ become arbitrarily close. Therefore, it
appears that the values $\gamma_m$ converge, for $m \to \infty$, to a value that we denote by $\gamma$
(the precise proof will be given in Theorem III.1.8 of Cauchy). We then take the
limit $m \to \infty$ in Eq. (10.21) and obtain


$\ln n! + n - \left(n + \frac12\right)\ln n = \gamma + \frac{1}{12n} - \frac{1}{360n^3} + \hat R_5,$


where $|\hat R_5| \le 0.00123/n^4$. Taking the exponential function of this expression we
get


(10.22)    $n! = D_n \cdot \frac{\sqrt{n}\; n^n}{e^n} \qquad \text{with} \qquad D_n = e^\gamma \cdot \exp\left(\frac{1}{12n} - \frac{1}{360n^3} + \hat R_5\right).$


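This proves (10.18) and also (10.17), as soon as we have seen that the limit of $D_n$
(i.e., $D = e^\gamma$) is actually equal to $\sqrt{2\pi}$. To this end, we compute, from (10.22),

$\frac{D_n \cdot D_n}{D_{2n}} = \frac{n! \cdot n! \cdot \sqrt{2n}\,(2n)^{2n} \cdot e^{2n}}{\sqrt{n}\,n^n \cdot \sqrt{n}\,n^n \cdot e^{2n} \cdot (2n)!} = \frac{2 \cdot 4 \cdot 6 \cdots 2n}{1 \cdot 3 \cdot 5 \cdots (2n-1)} \cdot \sqrt{\frac{2}{n}},$


which tends to $D$ too. This formula reminds us of Wallis’s product of Eq. (I.5.27).
Indeed, its square,

$\left(\frac{D_n \cdot D_n}{D_{2n}}\right)^2 = \underbrace{\frac{2 \cdot 2 \cdot 4 \cdot 4 \cdot 6 \cdot 6 \cdots (2n)(2n)}{1 \cdot 3 \cdot 3 \cdot 5 \cdot 5 \cdot 7 \cdots (2n-1)(2n+1)}}_{\to\ \pi/2} \cdot \underbrace{\frac{2(2n+1)}{n}}_{\to\ 4},$


tends to $2\pi$, so that $D = \sqrt{2\pi}$. The stated estimate for $\tilde R_9$ follows from (10.12)
and $|B_9(x)| \le 0.04756$.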


The Harmonic Series and Euler’s Constant 


We try to compute
$\sum_{i=1}^{n} \frac1i = 1 + \frac12 + \frac13 + \ldots + \frac1n$
by putting $f(x) = 1/x$ in Theorem 10.2. Since $f^{(j)}(x) = (-1)^j\, j!\, x^{-j-1}$, we get,
instead of (10.19),

FIGURE 10.2. Autograph of Euler containing his constant¹

(10.23)    $\sum_{i=n+1}^{m} \frac1i = \ln m - \ln n + \frac12\left(\frac1m - \frac1n\right) - \frac{1}{12}\left(\frac{1}{m^2} - \frac{1}{n^2}\right) + \frac{1}{120}\left(\frac{1}{m^4} - \frac{1}{n^4}\right) - \frac{1}{252}\left(\frac{1}{m^6} - \frac{1}{n^6}\right) + \frac{1}{240}\left(\frac{1}{m^8} - \frac{1}{n^8}\right) + \tilde R_9,$
where, because of $|B_9(x)| \le 0.04756$, we have $|\tilde R_9| \le 0.00529/n^9$. The
diverging terms to collect will now be, instead of (10.20),
$\gamma_n = 1 + \frac12 + \frac13 + \ldots + \frac1n - \ln n,$
which is investigated precisely as above and seen to converge. This time, the
constant obtained,
(10.24)    $1 + \frac12 + \frac13 + \ldots + \frac1n - \ln n \;\to\; \gamma = 0.57721566490153286\ldots,$
is a new constant in mathematics and is called “Euler’s constant” (see Fig. 10.2 
for an autograph of Euler containing his constant and its use for the computation 
of the sum of Example 10.1). Letting, as before, m — oo in (10.23), we obtain 


(10.25)    $\sum_{i=1}^{n} \frac1i = \gamma + \ln n + \frac{1}{2n} - \frac{1}{12n^2} + \frac{1}{120n^4} - \frac{1}{252n^6} + \frac{1}{240n^8} + \hat R_9,$
where $|\hat R_9| \le 0.00529/n^9$. To find the constant $\gamma$, we put, for example, $n = 10$
(as did Euler) in Eq. (10.25) and obtain the value of (10.24). This constant was
computed with great precision by D. Knuth (1962). It is still not known whether it 
is rational or irrational. 
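
Euler's computation with $n = 10$ is easily repeated. The following Python sketch solves (10.25) for $\gamma$ and neglects the remainder, whose size is bounded by $0.00529/10^9$; the function name is chosen here for illustration.

```python
import math

def euler_constant_estimate(n):
    """Solve (10.25) for gamma, neglecting the remainder term."""
    harmonic = math.fsum(1.0 / i for i in range(1, n + 1))
    return (harmonic - math.log(n)
            - 1 / (2 * n)
            + 1 / (12 * n**2)
            - 1 / (120 * n**4)
            + 1 / (252 * n**6)
            - 1 / (240 * n**8))

gamma_10 = euler_constant_estimate(10)
print("gamma from n = 10:", gamma_10)
print("value in (10.24) : 0.57721566490153286")
```

Summing only ten terms of the harmonic series thus suffices for about eleven correct digits, whereas the plain difference $1 + \frac12 + \ldots + \frac1n - \ln n$ would need an astronomically large $n$ for the same accuracy.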


' Reproduced with permission of Birkhaeuser Verlag, Basel. 
Exercises


10.1 The spiral of Theodorus is composed of rectangular triangles of sides $1$, $\sqrt{n}$,
and $\sqrt{n+1}$. It performs a complete rotation after 17 triangles (this seems to
be the reason why Theodorus did not consider roots beyond $\sqrt{17}$). No longer
prevented by such scruples, we now want to know how many rotations a
billion such triangles perform. This requires the calculation of (see Fig. 10.3)


$\frac{1}{2\pi} \sum_{i=1}^{1000000000} \arctan\frac{1}{\sqrt{i}}$


with an error smaller than 1. This exercise is not only a further occasion to
admire the power of the Euler-Maclaurin formula, but also leaves us with an
interesting integral to evaluate.


FIGURE 10.3. The spiral of Theodorus of Cyrene, 470-390 B.C.


10.2 (Formula for the Taylor series of $\tan x$). If we let $\cot x = 1/\tan x$ and
$\coth x = 1/\tanh x$, Eq. (10.9) can be seen to represent the Taylor series of
$(x/2)\coth(x/2)$. This allows us to obtain the series expansion of $x \cdot \coth x$,
and, by letting $x \mapsto ix$, that of $x \cdot \cot x$. Finally, use the formula


$2 \cdot \cot 2x = \cot x - \tan x$


and obtain the coefficients of the expansion of $\tan x$. Compare it with
Eq. (I.4.18).


10.3 Verify numerically the estimates (Lehmer 1940)
$|B_3(x)| \le 0.04812, \qquad |B_5(x)| \le 0.02446, \qquad |B_7(x)| \le 0.02607,$
$|B_9(x)| \le 0.04756, \qquad |B_{11}(x)| \le 0.13250, \qquad |B_{13}(x)| \le 0.52357$


for $0 \le x \le 1$.


III


Foundations of Classical Analysis


... I am not sure that I shall still do geometry ten years from now. I also
think that the mine is already almost too deep, and must sooner or later be 
abandoned. Today, Physics and Chemistry offer more brilliant discoveries 
and which are easier to exploit... 
(Lagrange, Sept. 21, 1781, Letter to d’Alembert, Oeuvres, vol. 13, p. 368) 
Euler’s death in 1783 was followed by a period of stagnation in mathematics. He 
had indeed solved everything: an unsurpassed treatment of infinite and differential 
calculus (Euler 1748, 1755), solvable integrals solved, solvable differential equa- 
tions solved (Euler 1768, 1769), the secrets of liquids (Euler 1755b), of mechan- 
ics (Euler 1736b, Lagrange 1788), of variational calculus (Euler 1744), of algebra 
(Euler 1770), unveiled. It seemed that no other task remained than to study about 
30,000 pages of Euler’s work. 

The “Théorie des fonctions analytiques” by Lagrange (1797), “freed from 
all considerations of infinitely small quantities, vanishing quantities, limits and 
fluxions”, the thesis of Gauss (1799) on the “Fundamental Theorem of Algebra” 
and the study of the convergence of the hypergeometric series (Gauss 1812) mark 
the beginning of a new era. 

Bolzano points out that Gauss’s first proof is lacking in rigor; he then gives
in 1817 a “purely analytic proof of the theorem, that between two values which
produce opposite signs, there exists at least one root of the equation” (Theorem
III.3.5 below). In 1821, Cauchy establishes new requirements of rigor in his
famous “Cours d’Analyse”. The questions are the following:


— What is a derivative really? Answer: a limit.
— What is an integral really? Answer: a limit.
— What is an infinite series $a_1 + a_2 + a_3 + \ldots$ really? Answer: a limit.

This leads to
— What is a limit? Answer: a number.


And, finally, the last question: 
— What is a number? 

Weierstrass and his collaborators (Heine, Cantor), as well as Méray, answer 
that question around 1870-1872. They also fill many gaps in Cauchy’s proofs 
by clarifying the notions of uniform convergence (see picture below), uniform 
continuity, the term by term integration of infinite series, and the term by term 
differentiation of infinite series. 

Sections III.5, III.6, and III.7, on, respectively, the integral calculus, the
differential calculus, and infinite power series, will be the heart of this chapter. The
preparatory Sections III.1 through III.4 will enable us to build our theories on a
solid foundation. Section III.8 completes the integral calculus and Section III.9
presents two results of Weierstrass on continuous functions that were both
spectacular discoveries of the epoch.

Weierstrass explains uniform convergence to Cauchy
who meditates over Abel’s counterexample
(Drawing by K. Wanner)

III.1 Infinite Sequences and Real Numbers


If, for every positive integer $n$, we have given a number $s_n$, then we speak of an
(infinite) sequence and we write


(1.1)    $\{s_n\} = \{s_1, s_2, s_3, s_4, s_5, \ldots\}.$


The number $s_n$ is called the $n$th term or the general term of the sequence.
A first example is


(1.2)    $\{1, 2, 3, 4, 5, 6, \ldots\},$


which is an arithmetic progression. This means that the difference of two
successive terms is constant. The sequence


(1.3)    $\{q^0, q^1, q^2, q^3, q^4, \ldots\}$


is a geometric progression (the quotient of two successive terms is constant).


Convergence of a Sequence 


One says that a quantity is the limit of another quantity, if the second ap- 
proaches the first closer than any given quantity, however small ... 
(D’ Alembert 1765, Encyclopédie, tome neuvieme, a Neufchastel.) 


When a variable quantity converges towards a fixed limit, it is often useful 
to indicate this limit by a specific notation, which we shall do by setting the 
abbreviation 
lim 
in front of the variable in question ... 
(Cauchy 1821, Cours d’Analyse) 

If the terms $s_n$ of a sequence (1.1) approach arbitrarily closely a number $s$ for $n$
large enough, we call this number the limit of (1.1). This concept is very important
and calls for more precision:

— “arbitrarily closely” means “closer than any positive number $\varepsilon$”, i.e., $|s_n - s| < \varepsilon$.
Here, $|\cdot|$ is the absolute value and forces $s_n$ to be close to $s$ in the positive
and the negative direction.

— “for $n$ large enough” means that there must be an $N$ such that the above
estimate is true for all $n \ge N$.

With the symbols $\forall$ (“for all”) and $\exists$ (“there exists”), we can thus express the
above situation in the following compact form.


(1.1) Definition (D’Alembert 1765, Cauchy 1821). We say that a sequence (1.1)
converges if there exists a number $s$ such that


(1.4)    $\forall \varepsilon > 0 \quad \exists N \ge 1 \quad \forall n \ge N \qquad |s_n - s| < \varepsilon.$


We then write

(1.5)    $s = \lim_{n \to \infty} s_n \qquad \text{or} \qquad s_n \to s.$


If (1.4) is not true for any $s$, the sequence (1.1) is said to diverge.

FIGURE 1.1. Convergence of the sequence (1.6)


(1.2) Examples. Consider the sequence


$\left\{\frac12, \frac23, \frac34, \frac45, \frac56, \ldots\right\} \qquad \text{where} \qquad s_n = \frac{n}{n+1}.$


This sequence converges to 1, because


$|s_n - 1| = \left|\frac{n}{n+1} - 1\right| = \frac{1}{n+1} < \varepsilon$

for $1/(n+1) < \varepsilon$, hence for $n > 1/\varepsilon - 1$. Therefore, for a given $\varepsilon > 0$, we can
take for $N$ an integer that is larger than $1/\varepsilon - 1$ and condition (1.4) is verified.
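As the next example, we choose the sequence


(1.6)    $s_1 = 1, \quad s_2 = 1 + \frac12, \quad s_3 = 1 + \frac12 - \frac13, \quad s_4 = 1 + \frac12 - \frac13 - \frac14, \quad \ldots, \quad s_n = \sum_{i=0}^{n-1} \frac{(-1)^{[i/2]}}{i+1}$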


(here $[i/2]$ denotes the largest integer $k$ not exceeding $i/2$; i.e., $[i/2] = k$ if $i =
2k$ or $i = 2k + 1$). This sequence is somewhat less trivial and is illustrated in
Fig. 1.1. It seems to converge to a number close to 1.13 (which we guess, after
our experience of Chap. I, to be $\pi/4 + \ln 2/2$). We observe that for a given $\varepsilon$
(here $\varepsilon = 0.058$), there is a last $s_n$ (here $s_{16}$) violating $|s_n - s| < \varepsilon$. Hence,
for $N = 17$, (1.4) is satisfied. The fact that several earlier terms ($s_3, s_5, \ldots$) also
satisfy this estimate does not contradict (1.4).
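
The search for a suitable $N$ can be imitated on a computer. The Python sketch below takes the sequence with the sign pattern $+, +, -, -, +, +, \ldots$ described above, accepts the guessed limit $\pi/4 + \ln 2/2$ as an assumption, and looks for the last index violating $|s_n - s| < \varepsilon$ among the first 200 terms. The cut-off of 200 terms and the function name are arbitrary choices for illustration; a finite search of this kind illustrates condition (1.4) but cannot prove it.

```python
import math

def s(n):
    """n-th term of the sequence 1 + 1/2 - 1/3 - 1/4 + 1/5 + 1/6 - ..."""
    return math.fsum((-1) ** (i // 2) / (i + 1) for i in range(n))

limit = math.pi / 4 + math.log(2) / 2   # guessed limit, taken on trust here
eps = 0.058

violations = [n for n in range(1, 201) if abs(s(n) - limit) >= eps]
last_violation = max(violations)
print("indices violating the estimate:", violations)
print("last violation at n =", last_violation, " so N =", last_violation + 1)
```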
(1.3) Theorem. If a sequence $\{s_n\}$ converges, then it is bounded, i.e.,


(1.7)    $\exists B \quad \forall n \ge 1 \qquad |s_n| \le B.$


Proof. We put $\varepsilon = 1$. By the definition of convergence, we know the existence of
an integer $N$ such that $|s_n - s| < 1$ for all $n \ge N$. Using the triangle inequality
(see Exercise 1.1), we obtain $|s_n| = |s_n - s + s| \le |s_n - s| + |s| < 1 + |s|$ for $n \ge N$
and the statement is proved with $B = \max\{|s_1|, |s_2|, \ldots, |s_{N-1}|, |s| + 1\}$.


For the boundedness of a sequence it is not necessary that it converge. For
example, the sequence


(1.8)    $\{-1, 1, -1, 1, -1, 1, -1, \ldots\}$


is bounded (with $B = 1$) but does not converge.
The sequence (1.2) is neither bounded nor does it converge. The general
arithmetic progression


(1.9)    $\{d, 2d, 3d, 4d, 5d, \ldots\}$


is also unbounded (for $d \ne 0$). For $d > 0$ this sequence satisfies


(1.10)    $\forall M > 0 \quad \exists N \ge 1 \quad \forall n \ge N \qquad s_n > M.$


To see this, take an integer $N$ satisfying $N > M/d$. If (1.10) is verified, we say
that the sequence $\{s_n\}$ tends to infinity and we write


$\lim_{n \to \infty} s_n = \infty \qquad \text{or} \qquad s_n \to \infty.$

In a similar way, one can define $\lim_{n \to \infty} s_n = -\infty$. We next investigate the
convergence of sequence (1.3).


(1.4) Lemma. For the geometric progression (1.3), we have


$\lim_{n \to \infty} q^n = \begin{cases} 0 & \text{for } |q| < 1, \\ 1 & \text{for } q = 1, \\ \infty & \text{for } q > 1. \end{cases}$


The sequence (1.3) diverges for $q \le -1$.


Proof. Let us start with the case $q > 1$. We write $q = 1 + r$ (with $r > 0$) and apply
Theorem I.2.1 to obtain

$q^n = (1 + r)^n = 1 + nr + \frac{n(n-1)}{2}\,r^2 + \ldots \ge 1 + nr.$

Therefore, the terms $q^n$ tend to infinity (for a given $M$ choose $N > M/r$ in
(1.10)). The statement is trivial for $q = 1$.

For $|q| < 1$ we consider the sequence $s_n = (1/|q|)^n$, which tends to infinity
by the above considerations. For a given $\varepsilon > 0$ we put $M = 1/\varepsilon$ and apply (1.10)
to the sequence $\{s_n\}$. This proves the existence of an integer $N$ such that for all
$n \ge N$ we have $s_n > M$ or equivalently $|q^n| < \varepsilon$. This proves that $q^n \to 0$. For
$q = -1$ the sequence oscillates between $-1$ and $1$ and for $q < -1$ it is unbounded
and oscillating.


The following theorem simplifies the computation of limits. 


(1.5) Theorem. Consider two convergent sequences $s_n \to s$ and $v_n \to v$. Then,
the sum, the product, and the quotient of the two sequences, taken term by term,
converge as well, and we have


(1.11)    $\lim_{n \to \infty} (s_n + v_n) = s + v$

(1.12)    $\lim_{n \to \infty} (s_n \cdot v_n) = s \cdot v$

(1.13)    $\lim_{n \to \infty} \left(\frac{s_n}{v_n}\right) = \frac{s}{v} \qquad \text{if } v_n \ne 0 \text{ and } v \ne 0.$


Proof. We begin with the proof of (1.11). We estimate


$|(s_n + v_n) - (s + v)| = |s_n - s + v_n - v| \le \underbrace{|s_n - s|}_{< \varepsilon} + \underbrace{|v_n - v|}_{< \varepsilon} < 2\varepsilon = \varepsilon'$


by the triangle inequality. For the proof to be logical this sequence of formulas
has to be read from back to front: given $\varepsilon' > 0$ arbitrarily small, we choose $\varepsilon > 0$
such that $2\varepsilon = \varepsilon'$. By hypothesis, the two sequences $\{s_n\}$ and $\{v_n\}$ converge to $s$
and $v$. This means that there exist $N_1$ and $N_2$ such that $|s_n - s| < \varepsilon$ for $n \ge N_1$
and $|v_n - v| < \varepsilon$ for $n \ge N_2$. If we choose $N = \max(N_1, N_2)$, we see that (1.4)
is satisfied for the sequence $\{s_n + v_n\}$. Once we are accustomed to this argument,
repeating these explanations will not be necessary.

For the proof of (1.12) we have to estimate $s_n v_n - sv$. Let us add and subtract
“mixed products” $-s v_n + s v_n$ such that


$|s_n v_n - sv| = |s_n v_n - s v_n + s v_n - sv| \le |v_n| \cdot |s_n - s| + |s| \cdot |v_n - v| < (B + |s|)\,\varepsilon = \varepsilon'.$


Here, we have used Theorem 1.3 for the sequence $\{v_n\}$.

It is sufficient to prove (1.13) for the special case where $s_n = 1$ for all $n$, and
hence $s = 1$. The general result will then follow from (1.12) because $s_n/v_n$ is the
product of $(1/v_n)$ and $s_n$. We first observe that the values of $|v_n|$ cannot become
arbitrarily small. Indeed, if we put $\varepsilon = |v|/2$ in the definition of convergence, we
obtain $|v_n - v| < |v|/2$ (and hence also $|v_n| > |v|/2$) for sufficiently large $n$.
With this estimate, we now obtain


$\left|\frac{1}{v_n} - \frac{1}{v}\right| = \frac{|v - v_n|}{|v_n| \cdot |v|} \le \frac{2\,|v_n - v|}{|v|^2} < \frac{2\varepsilon}{|v|^2} = \varepsilon'.$
(1.6) Theorem. Assume that a sequence $\{s_n\}$ converges to $s$ and that $s_n \le B$ for
all sufficiently large $n$. Then, the limit also satisfies $s \le B$.


Proof. We shall show that $s > B$ leads to a contradiction. For this we put $\varepsilon =
s - B > 0$ and use (1.4). This implies that for sufficiently large $n$ we have


$s - s_n \le |s_n - s| < \varepsilon = s - B,$


so that $s_n > B$, which is in contradiction to our assumption.


Remark. The analogous result for strict inequalities ($s_n < B$ for all $n$ implies
$s < B$) is wrong. This is seen by the counterexample $s_n = n/(n + 1) < 1$ with
$\lim_{n \to \infty} s_n = 1$.


Cauchy Sequences. Let us now tackle an important problem. The definition of
convergence (1.4) forces us to estimate $|s_n - s|$; the limit $s$ has to be known. But
what can we do if the limit $s$ is unknown, or, as in Example (1.6), is not known to
arbitrary precision? It is then impossible to estimate with rigor $|s - s_n| < \varepsilon$ for
any $\varepsilon > 0$. To bypass this obstacle, Cauchy had the idea of replacing $|s_n - s| < \varepsilon$
in (1.4) by $|s_n - s_{n+k}| < \varepsilon$ for all the successors $s_{n+k}$ of $s_n$.


(1.7) Definition. A sequence $\{s_n\}$ is a Cauchy sequence if


(1.14)    $\forall \varepsilon > 0 \quad \exists N \ge 1 \quad \forall n \ge N \quad \forall k \ge 1 \qquad |s_n - s_{n+k}| < \varepsilon.$


FIGURE 1.2. Sequence (1.6) as a Cauchy sequence


Example. Fig. 1.2 illustrates condition (1.14) for the sequence (1.6). We see that,
e.g., for $\varepsilon = 0.11$ condition (1.14) is satisfied for $n \ge 17$. Similarly, it is also seen
that (1.14) is true for any $\varepsilon > 0$, because $1/(n + 2) + 1/(n + 3)$ tends to zero.


(1.8) Theorem (Cauchy 1821). A sequence $\{s_n\}$ of real numbers is convergent
(with a real number as limit) if and only if it is a Cauchy sequence.

It is an immediate consequence of $|s_n - s_{n+k}| \le |s_n - s| + |s - s_{n+k}| <
2\varepsilon$ that convergent sequences must be Cauchy sequences. A rigorous proof of
the converse implication, beyond Cauchy’s intuition, is only possible after having
understood the concept of irrational and real numbers. In contrast to the results
obtained until now (Theorems 1.3, 1.5, and 1.6), Theorem 1.8 is not true in the
setting of rational numbers. Consider, for example, the sequence


(1.15)    $\{1;\ 1.4;\ 1.41;\ 1.414;\ 1.4142;\ 1.41421;\ \ldots\}.$


It is indeed a Cauchy sequence (we have $|s_n - s_{n+k}| < 10^{-n+1}$), but its limit $\sqrt2$
is not rational.
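
The estimate $|s_n - s_{n+k}| < 10^{-n+1}$ can be checked with exact rational arithmetic. In the Python sketch below, the decimal truncations of $\sqrt2$ are produced with the integer square root; the function name and the tested ranges of $n$ and $k$ are choices made here for illustration, and a finite check is of course no substitute for the general argument.

```python
from fractions import Fraction
from math import isqrt

def sqrt2_truncation(n):
    """s_n of (1.15): sqrt(2) cut off after n - 1 decimals, as an exact fraction."""
    scale = 10 ** (n - 1)
    return Fraction(isqrt(2 * scale * scale), scale)

terms = [sqrt2_truncation(n) for n in range(1, 21)]
print([str(t) for t in terms[:5]])

# Cauchy estimate |s_n - s_{n+k}| < 10^(-n+1) for a range of n and k.
for n in range(1, 13):
    for k in range(1, 9):
        assert abs(terms[n - 1] - terms[n + k - 1]) < Fraction(1, 10 ** (n - 1))

# Every term is rational, but none of them squares to 2 exactly.
assert all(t * t != 2 for t in terms)
```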


Construction of Real Numbers 


The more I meditate on the principles of the theory of functions — and I 

do this unremittingly — the stronger becomes my conviction that the foun- 

dations upon which these must be built are the truths of Algebra .. . 
(Weierstrass 1875, Werke, vol. 2, p. 235) 


Please forget everything you have learned in school; for you haven’t learned 
it.... My daughters have been studying (chemistry) for several semesters 
already, think they have learned differential and integral calculus in school, 
and even today don’t know why x- y = y- x is true. 

(Landau 1930, Engl. transl. 1945) 


$\sqrt3$ is thus only a symbol for a number which has yet to be found, but is not
its definition. This definition is, however, satisfactorily given by my method
as, say


$(1.7,\ 1.73,\ 1.732,\ \ldots)$


(G. Cantor 1889) 


... the definition of irrational numbers, on which geometric representa- 
tions have often had a confusing influence. ... I take in my definition a 
purely formal point of view, calling some given symbols numbers, so that 
the existence of these numbers is beyond doubt. (Heine 1872) 


At that point, my sense of dissatisfaction was so strong that I firmly re- 
solved to start thinking until I should find a purely arithmetic and abso- 
lutely rigorous foundation of the principles of infinitesimal analysis. ... I 
achieved this goal on November 24th, 1858, ... but I could not really de- 
cide upon a proper publication, because, firstly, the subject is not easy to 
present, and, secondly, the material is not very fruitful. 

(Dedekind 1872) 


Demeaning Analysis to a mere game with symbols ... 

(Du Bois-Reymond, Allgemeine Funktionentheorie, Tübingen 1882)
For many decades nobody knew how irrational numbers should be put into a
rigorous mathematical setting, how to grasp correctly what should be the “ultimate
term” of a Cauchy sequence such as (1.15). This “Gordian knot” was finally
resolved independently by Cantor (1872), Heine (1872), Méray (1872) (and
similarly by Dedekind 1872) by the following audacious idea: the whole Cauchy
sequence is declared “to be” the real number in question (see quotations). This
means that we associate to a Cauchy sequence of rational numbers $s_n$ (henceforth
called a rational Cauchy sequence) a real number.
This seems to resolve Theorem 1.8 in an elegant manner. But there remains
much to do: we shall have to identify different rational Cauchy sequences that
represent the same real number, define algebraic and order relations for these new
objects, and finally we shall find the proof of Theorem 1.8 more complicated than
we might have thought, because the terms $s_n$ in (1.14) may now themselves be
real numbers, i.e., rational Cauchy sequences. All these details have been worked
out in full by Landau (1930) in a famous book, where he admits himself that
many parts are “eine langweilige Mühe”.


Equivalence Relation. Suppose that


$\sqrt2$ is associated to $\{1.4\,;\,1.41\,;\,1.414\,;\,\ldots\}$
$\sqrt3$ is associated to $\{1.7\,;\,1.73\,;\,1.732\,;\,\ldots\}$,


then $\sqrt2 \cdot \sqrt3$ should be associated to the sequence of the products
$\{2.38\,;\,2.4393\,;\,2.449048\,;\,\ldots\}$.


On the other hand, $\sqrt6$ is also associated to $\{2.4\,;\,2.44\,;\,2.449\,;\,\ldots\}$. So we have
to identify the two sequences.

Two rational Cauchy sequences $\{s_n\}$ and $\{v_n\}$ are called equivalent, if
$\lim_{n \to \infty}(s_n - v_n) = 0$, i.e., if


(1.16)    $\forall \varepsilon > 0 \quad \exists N \ge 1 \quad \forall n \ge N \qquad |s_n - v_n| < \varepsilon.$


We then write $\{s_n\} \sim \{v_n\}$. It is not difficult to check that (1.16) defines an
equivalence relation on the set of all rational Cauchy sequences. This means that
we have


$\{s_n\} \sim \{s_n\}$    (reflexive)
$\{s_n\} \sim \{v_n\} \implies \{v_n\} \sim \{s_n\}$    (symmetric)
$\{s_n\} \sim \{v_n\},\ \{v_n\} \sim \{w_n\} \implies \{s_n\} \sim \{w_n\}$    (transitive).


Therefore, it is possible to partition the set of rational Cauchy sequences into
equivalence classes,


$\overline{\{s_n\}} = \big\{\{v_n\} \;\big|\; \{v_n\} \text{ is a rational Cauchy sequence and } \{v_n\} \sim \{s_n\}\big\}.$
Elements of equivalence classes are called representatives.
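
The two sequences associated to $\sqrt6$ in the discussion above give a concrete instance of condition (1.16). The Python sketch below forms, with exact fractions, the products of the decimal truncations of $\sqrt2$ and $\sqrt3$ and compares them with the truncations of $\sqrt6$; the helper `truncated_sqrt` and the bound $4 \cdot 10^{-n}$ used in the check are assumptions of this illustration (the bound follows from $\sqrt2 + \sqrt3 < 4$).

```python
from fractions import Fraction
from math import isqrt

def truncated_sqrt(a, digits):
    """sqrt(a) cut off after `digits` decimals, as an exact fraction."""
    scale = 10 ** digits
    return Fraction(isqrt(a * scale * scale), scale)

for n in range(1, 11):
    product = truncated_sqrt(2, n) * truncated_sqrt(3, n)   # 2.38, 2.4393, ...
    direct = truncated_sqrt(6, n)                           # 2.4, 2.44, ...
    difference = abs(product - direct)
    # The difference of the two representatives tends to zero.
    assert difference < Fraction(4, 10 ** n)
    print(n, float(product), float(direct), float(difference))
```

The two sequences differ term by term, yet their difference tends to zero, so they are representatives of the same equivalence class.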


(1.9) Definition. Real numbers are equivalence classes of rational Cauchy
sequences, i.e.,

    $\mathbb{R} = \bigl\{ \overline{\{s_n\}} \ \big|\ \{s_n\}$ is a rational Cauchy sequence $\bigr\}.$


The set $\mathbb{Q}$ of rational numbers can be interpreted as a subset of $\mathbb{R}$ in the
following way: if $r$ is an element of $\mathbb{Q}$ (abbreviated: $r \in \mathbb{Q}$), then the constant
sequence $\{r, r, r, \ldots\}$ is a rational Cauchy sequence. Hence, we identify the
rational number $r$ with the real number $\overline{\{r, r, \ldots\}}$.


Addition and Multiplication. In order to be able to work with $\mathbb{R}$, we have to
define the usual operations. Let $s = \overline{\{s_n\}}$ and $v = \overline{\{v_n\}}$ be two real numbers. We
then define their sum (difference), product (quotient) by

(1.17)    $s + v := \overline{\{s_n + v_n\}}, \qquad s \cdot v := \overline{\{s_n \cdot v_n\}}.$

We have to take some care with this definition. First of all, we have to ensure that
the sequences $\{s_n + v_n\}$ and $\{s_n \cdot v_n\}$ are rational Cauchy sequences (this follows
from $|(s_n + v_n) - (s_{n+k} + v_{n+k})| \le |s_n - s_{n+k}| + |v_n - v_{n+k}|$ for the sum
and is obtained as in the proof of Theorem 1.5 for the product). Then, we have
to prove that (1.17) is well-defined. If we choose different representatives of the
equivalence classes $s$ and $v$, say $\{s'_n\}$ and $\{v'_n\}$, then the results $s + v$ and $s \cdot v$
have to be the same. For this we have to prove that $s_n - s'_n \to 0$ and $v_n - v'_n \to 0$ imply
$(s_n + v_n) - (s'_n + v'_n) \to 0$ and $(s_n \cdot v_n) - (s'_n \cdot v'_n) \to 0$. But this is obtained
exactly as in the proof of Theorem 1.5.
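
The independence of the product from the chosen representative can be checked numerically. The following is only a sketch under stated assumptions: the helper names `decimal_root`, `newton_root`, and `representative_gap` are hypothetical, and the two representatives of $\sqrt{2}$ (decimal truncations and Newton iterates) are chosen purely for illustration. All arithmetic is exact rational arithmetic, so the sequences really are rational Cauchy sequences.

```python
from fractions import Fraction
from math import isqrt

def decimal_root(a, n):
    """Rational n-digit decimal truncation of sqrt(a)."""
    return Fraction(isqrt(a * 10 ** (2 * n)), 10 ** n)

def newton_root(a, n):
    """n-th Newton iterate x -> (x + a/x)/2 for sqrt(a), started at 1."""
    x = Fraction(1)
    for _ in range(n):
        x = (x + a / x) / 2
    return x

def representative_gap(n):
    """|s_n * v_n - s'_n * v_n| for two representatives s, s' of sqrt(2)."""
    s, s_alt, v = decimal_root(2, n), newton_root(2, n), decimal_root(3, n)
    return abs(s * v - s_alt * v)

# The gaps shrink toward 0, i.e. the two product sequences are equivalent.
gaps = [representative_gap(n) for n in range(1, 7)]
```

The list `gaps` illustrates $(s_n \cdot v_n) - (s'_n \cdot v_n) \to 0$ for these particular sequences; it is of course no substitute for the proof.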

As a next step, we have to verify the known rules of computation with
real numbers (commutativity, associativity, distributivity). Here begins Landau’s
“langweilige Mühe”. We omit these details and refer the reader either to Landau’s
marvelous book or to any introductory algebra text.


Order. Let $s = \overline{\{s_n\}}$ and $v = \overline{\{v_n\}}$ be two real numbers. We then define

(1.18)    $s < v \ :\Leftrightarrow\ \exists \varepsilon' > 0 \ \ \exists M \ge 1 \ \ \forall m \ge M \quad s_m \le v_m - \varepsilon',$
          $s \le v \ :\Leftrightarrow\ s < v \ \text{or}\ s = v$

(here the number $\varepsilon'$ has to be rational in order to avoid an ambiguous definition).
The rather complicated definition of $s < v$ means that for sufficiently large $m$
the elements $s_m$ and $v_m$ have to be well separated. It also implies that the
relation is well defined. Obviously, it is not sufficient to require $s_m < v_m$ (the
sequences $\{1, 1/2, 1/3, 1/4, \ldots\}$ and $\{0, 0, 0, \ldots\}$ both represent the real number
0 and serve as a counterexample).

The relation $s \le v$ of (1.18) defines an order relation. This means that

    $s \le s$    (reflexive)
    $s \le v,\ v \le w \ \Rightarrow\ s \le w$    (transitive)
    $s \le v,\ v \le s \ \Rightarrow\ s = v$    (antisymmetric).

We just indicate the proof of antisymmetry. Suppose that $s \le v$ and $v \le s$, but $s \ne v$.
Then, there exist positive rational numbers $\varepsilon'_1$ and $\varepsilon'_2$ such that $s_m \le v_m - \varepsilon'_1$
for $m \ge M_1$ and $v_m \le s_m - \varepsilon'_2$ for $m \ge M_2$. Hence, for $m \ge \max(M_1, M_2)$,
we have $\varepsilon'_2 \le s_m - v_m \le -\varepsilon'_1$, which is a contradiction.


(1.10) Lemma. The order $\le$ of (1.18) is total, i.e., for any two real numbers $s$ and
$v$ with $s \ne v$ we have either $s < v$ or $v < s$.
Remark. $s \ne v$ is the negation of $s = v$, which is expressed by Eq. (1.16). In
order to formulate the negation of a statement like (1.16), we recall a little bit of
logic. Let $S(x)$ be a statement depending on $x \in A$ ($A$ is some set) and $\neg S(x)$ its
negation. Then we have (see the footnote)

    $\forall x \in A \ \ S(x)$  is the negation of  $\exists x \in A \ \ \neg S(x)$,
    $\exists x \in A \ \ S(x)$  is the negation of  $\forall x \in A \ \ \neg S(x)$.

In order to obtain the negation of a long statement we have to reverse all quantifiers
($\forall \leftrightarrow \exists$) and replace the final statement by its negation. Hence, $s \ne v$ is obtained
from (1.16) as

(1.19)    $\exists \varepsilon > 0 \ \ \forall N \ge 1 \ \ \exists n \ge N \quad |s_n - v_n| \ge \varepsilon.$


Proof of Lemma 1.10. Let $s = \overline{\{s_n\}}$ and $v = \overline{\{v_n\}}$ be two distinct real numbers,
such that (1.19) holds. We then put $\varepsilon' = \varepsilon/3$. Since $\{s_n\}$ and $\{v_n\}$ are Cauchy
sequences, there exists $N_1$ such that $|s_n - s_{n+k}| < \varepsilon'$ for $n \ge N_1$ and $k \ge 1$, and
there exists $N_2$ such that $|v_n - v_{n+k}| < \varepsilon'$ for $n \ge N_2$ and $k \ge 1$. We then put
$N = \max(N_1, N_2)$ and deduce from (1.19) the existence of an integer $n \ge N$
such that $|s_n - v_n| \ge \varepsilon$. There are two possibilities,

(1.20)    $s_n - v_n \ge \varepsilon \quad \text{or} \quad v_n - s_n \ge \varepsilon.$

FIGURE 1.3. Illustration of the two cases in (1.20)

For $k \ge 1$ the numbers $s_{n+k}$ and $v_{n+k}$ stay in the disks of radius $\varepsilon' = \varepsilon/3$ (see
Fig. 1.3). Therefore, (1.18) is satisfied with $M = N$ and we have $s > v$ in the first
case, whereas $v > s$ in the second case.


Absolute Value. Once we have shown that the order is total (Lemma 1.10), it is
possible to define the absolute value of a number $s$ as being $s$ (for $s \ge 0$) and $-s$
(for $s < 0$). An easy consequence of this definition is that

(1.21)    $|s| = \overline{\{|s_n|\}}$  for  $s = \overline{\{s_n\}}$.

The triangle inequality $|s + v| \le |s| + |v|$ and all its consequences are valid for
real numbers.


Remark. In the Definitions and Theorems 1.1 through 1.7, we have not been very
precise about the concept of “number”. To be logically correct, they should have
been stated only for rational numbers. After having now introduced with much
pain the concept of real numbers, we can extend these definitions to real numbers
and check that the statements of the theorems remain valid also in the more general
context.

Footnote (to the Remark on negating quantified statements). The statement “all
($\forall$) polar bears are white” is wrong if there exists ($\exists$) at least one colored
(nonwhite) polar bear; and vice versa.


Proof of Theorem 1.8.
... until now these propositions were considered axioms.
(Méray 1869, see Dugac 1978, p. 82)
Let $\{s_i\}$ be a Cauchy sequence of real numbers, such that each $s_i$ itself is an
equivalence class of rational Cauchy sequences, i.e., $s_i = \overline{\{s_{in}\}_{n \ge 1}}$. The idea is
to choose for each $i$ a number becoming smaller and smaller (for example $1/(2i)$)
and to apply the definition of a rational Cauchy sequence in order to obtain

    $\exists N_i \ge 1 \ \ \forall n \ge N_i \ \ \forall m \ge N_i \quad |s_{in} - s_{im}| < \frac{1}{2i}.$

We then put $v_i := s_{i,N_i}$ and consider the rational sequence $\{v_i\}$ (see Fig. 1.4).


FIGURE 1.4. Convergence of a Cauchy sequence


a) We first prove that $|v_i - s_i| < 1/i$. By (1.21), the real number $|v_i - s_i|$ is
represented by the rational Cauchy sequence $\{|v_i - s_{im}|\}_{m \ge 1}$. Since, for $m \ge N_i$,

    $|v_i - s_{im}| = |s_{i,N_i} - s_{im}| < \frac{1}{2i} = \frac{1}{i} - \frac{1}{2i},$

it follows from (1.18) with $\varepsilon' = 1/(2i)$ that $|v_i - s_i| < 1/i$.
b) We next prove that $\{v_i\}$ is a rational Cauchy sequence. Observing that
$|v_i - v_{i+k}|$ does not change its value if it is considered as a rational or a real
number, we have

(1.22)    $|v_i - v_{i+k}| = |v_i - s_i + s_i - s_{i+k} + s_{i+k} - v_{i+k}|$
          $\le |v_i - s_i| + |s_i - s_{i+k}| + |s_{i+k} - v_{i+k}| < \frac{1}{i} + \varepsilon + \frac{1}{i+k} < 2\varepsilon$

for sufficiently large $i$ and for $k \ge 1$. The equivalence class of $\{v_i\}$, denoted by
$s := \overline{\{v_i\}}$, will be our candidate for the limit of $\{s_i\}$. It follows from (1.22) that
$|v_i - s| < 3\varepsilon$ (for large enough $i$) so that $v_i \to s$.
c) We finally prove that $s_i \to s$. From parts (a) and (b) of this proof and from
the triangle inequality, we have

    $|s_i - s| \le |s_i - v_i| + |v_i - s| < \frac{1}{i} + 3\varepsilon < 4\varepsilon$

for sufficiently large $i$. Hence, $s_i \to s$, and Theorem 1.8 is proved.


Monotone Sequences and Least Upper Bound 


Our next aim is to prove rigorously the fact that a majorized monotonically
increasing sequence converges to a real limit. This result has been used repeatedly
in Chap. II, especially in Sect. II.10.


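(1.11) Definition. Let $X$ be a subset of $\mathbb{R}$. A real number $\xi$ is called the least
upper bound (or supremum) of $X$ if

    i)  $\forall x \in X \quad x \le \xi$, and
    ii) $\forall \varepsilon > 0 \ \ \exists x \in X \quad x > \xi - \varepsilon$.

We then write $\xi = \sup X$.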


Condition (i) expresses the fact that $\xi$ is an upper bound of $X$, whereas
condition (ii) means that $\xi - \varepsilon$ is no longer an upper bound, so that $\xi$ is really the
smallest of all upper bounds. Our next result investigates the existence of such a
supremum: “This Theorem is ...” as Bolzano wrote in 1817, “... of the greatest
importance” (see Stolz 1881, p. 257). It is based on Theorem 1.8 and is not valid
in $\mathbb{Q}$ (the set $X = \{x \in \mathbb{Q} \mid x^2 < 2\}$ does not have a supremum in $\mathbb{Q}$).


FIGURE 1.5. Existence of the least upper bound for a monotone sequence

(1.12) Theorem. Let $X$ be a subset of $\mathbb{R}$ that is nonempty and majorized (i.e.,
$\exists B \ \forall x \in X \ \ x \le B$). Then, there exists a real number $\xi$ such that $\xi = \sup X$.


Proof. On Bolzano’s tracks (but also on Euclid’s, Elements, Book X), we do the
proof by bisection. We shall construct nested intervals $[\alpha_n, \beta_n]$ with lengths
decreasing geometrically to zero, such that $\alpha_n$ is not an upper bound of $X$ but $\beta_n$ is
one.
Since $X$ is nonempty, we can find a number $\alpha_0$ that is not an upper bound
(choose an element $x$ of $X$ and take $\alpha_0$ to the left of $x$). Our second assumption
($X$ is majorized) implies the existence of an upper bound. We choose one and call
it $\beta_0$. The idea is then to consider the midpoint $\gamma = (\alpha_0 + \beta_0)/2$ (see Fig. 1.5).
There are two possibilities: either $\gamma$ is an upper bound of $X$ (in this case, we
set $\alpha_1 := \alpha_0$ and $\beta_1 := \gamma$) or it is not (then, we put $\alpha_1 := \gamma$ and $\beta_1 := \beta_0$).
Repeating this procedure, we find a sequence of intervals $[\alpha_n, \beta_n]$ with lengths
$\beta_n - \alpha_n = (\beta_0 - \alpha_0)/2^n$.

By construction we see that all successors of $\alpha_n$ and $\beta_n$ lie inside the interval
$[\alpha_n, \beta_n]$. Consequently, we have the estimates

    $|\alpha_n - \alpha_{n+k}| \le \beta_n - \alpha_n = \frac{\beta_0 - \alpha_0}{2^n}, \qquad |\beta_n - \beta_{n+k}| \le \beta_n - \alpha_n = \frac{\beta_0 - \alpha_0}{2^n}.$

This shows that $\{\alpha_n\}$ and $\{\beta_n\}$ are Cauchy sequences. By Theorem 1.8, they are
convergent, and, since $\beta_n - \alpha_n = (\beta_0 - \alpha_0)/2^n \to 0$, they have the same limit
$\xi$ (Theorem 1.5). It now follows from Theorem 1.6 that $\xi$ is an upper bound of $X$
($x \le \beta_n$ implies $x \le \xi$). Furthermore, for a given $\varepsilon > 0$, there is an $\alpha_n$ satisfying
$\alpha_n > \xi - \varepsilon$. Since $\alpha_n$ is not an upper bound of $X$, $\xi - \varepsilon$ cannot be one either.
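
The bisection of this proof can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `sup_by_bisection` and the predicate `is_upper_bound` are hypothetical, and the set is taken to be $X = \{x \in \mathbb{Q} \mid x^2 < 2\}$ from above, for which a rational $g \ge 0$ is an upper bound exactly when $g^2 \ge 2$. Exact rational arithmetic is used so that every $\alpha_n$, $\beta_n$ is a rational number.

```python
from fractions import Fraction

def sup_by_bisection(is_upper_bound, alpha, beta, steps):
    """Nested intervals [alpha_n, beta_n]: alpha_n is never an upper bound,
    beta_n always is, and the length halves at each step."""
    for _ in range(steps):
        gamma = (alpha + beta) / 2          # midpoint of the current interval
        if is_upper_bound(gamma):
            beta = gamma                    # keep alpha, shrink from the right
        else:
            alpha = gamma                   # keep beta, shrink from the left
    return alpha, beta

def is_upper_bound(g):
    """Upper-bound test for X = {x in Q | x^2 < 2} (illustrative choice)."""
    return g >= 0 and g * g >= 2

# alpha_0 = 0 is not an upper bound (1 lies in X); beta_0 = 2 is one.
alpha, beta = sup_by_bisection(is_upper_bound, Fraction(0), Fraction(2), 30)
```

After 30 steps the interval has length $2/2^{30}$ and encloses the supremum; both endpoints are rational, while the common limit $\sqrt{2}$ is not, which is the reason the theorem fails in $\mathbb{Q}$.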


(1.13) Theorem. Consider a sequence $\{s_n\}$ that is monotonically increasing
($s_n \le s_{n+1}$) and majorized ($s_n \le B$ for all $n$). Then, it converges to a real
limit.

Proof. By hypothesis, the set $X = \{s_1, s_2, s_3, \ldots\}$ is nonempty and majorized
(see Fig. 1.5). Therefore, $\xi = \sup X$ exists by Theorem 1.12. By the definition of
$\sup X$, the value $\xi - \varepsilon$ is, for a given $\varepsilon > 0$, not an upper bound of $X$.
Consequently, there exists an $N$ such that $s_N > \xi - \varepsilon$. Since $X$ is majorized by $\xi$, we
have

    $\xi - \varepsilon < s_N \le s_{N+1} \le s_{N+2} \le s_{N+3} \le \ldots \le \xi,$

so that $\xi - \varepsilon < s_n \le \xi$ (and thus $|s_n - \xi| < \varepsilon$) for all $n \ge N$. This proves the
convergence of $\{s_n\}$ to $\xi$.


(1.14) Corollary. Consider two sequences $\{s_n\}$ and $\{v_n\}$. Suppose that $\{s_n\}$ is
monotonically increasing ($s_n \le s_{n+1}$) and that $s_n \le v_n$ for all (sufficiently large)
$n$. Then, we have

    $\{v_n\}$ converges  $\Rightarrow$  $\{s_n\}$ converges,

    $\{s_n\}$ diverges  $\Rightarrow$  $\{v_n\}$ diverges.

Proof. If $\{v_n\}$ converges, then it is bounded by Theorem 1.3. Hence, $\{s_n\}$ is also
bounded and its convergence follows from Theorem 1.13. The second line is the
contrapositive of the first one.


Remark. In an analogous way, we define the lower bound of a set, we define
minorized and monotonically decreasing sequences, and we use the notation

(1.23)    $\xi = \inf X$

for the greatest lower bound or infimum of $X$ (i.e., $x \ge \xi$ for all $x \in X$ and
$\forall \varepsilon > 0 \ \exists x \in X$ with $x < \xi + \varepsilon$). There are theorems analogous to Theorems
1.12 and 1.13.


Accumulation Points 


I find it really surprising that Mr. Weierstrass and Mr. Kronecker can attract 
so many students — between 15 and 20 — to lectures that are so difficult 
and at such a high level. 

(letter of Mittag-Leffler 1875, see Dugac 1978, p. 68) 


The sequence 


(1.24) ~ ~, os S a a) 


sd Ww ae Roe, Ww 1 


does not converge, but if every other term is removed, it converges either to 0 or 
to 1. A sequence with missing terms is a “subsequence”. More precisely, 


(1.15) Definition. A sequence $\{s'_n\}$ is called a subsequence of $\{s_n\}$ if there exists
an increasing mapping $\sigma : \mathbb{N} \to \mathbb{N}$ with $s'_n = s_{\sigma(n)}$ (increasing means that
$\sigma(n) < \sigma(m)$ if $n < m$).


(1.16) Definition. A point $s$ is called an accumulation point of a sequence $\{s_n\}$,
if there exists a subsequence converging to $s$.


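Examples. The points 0 and 1 are accumulation points of the sequence (1.24). An
interesting example is the sequence

(1.25)    $\bigl\{ \tfrac12, \tfrac13, \tfrac23, \tfrac14, \tfrac24, \tfrac34, \tfrac15, \tfrac25, \tfrac35, \tfrac45, \tfrac16, \tfrac26, \ldots \bigr\}$

which admits all numbers between 0 and 1 (0 and 1 included) as accumulation
points. To see that, for example, $\ln 2$ is an accumulation point of (1.25), consider
the sequence

    $\bigl\{ \tfrac{6}{10}, \tfrac{69}{100}, \tfrac{693}{1000}, \tfrac{6931}{10000}, \tfrac{69314}{100000}, \tfrac{693147}{1000000}, \ldots \bigr\}.$

It is certainly included somewhere in (1.25) and converges to $\ln 2$.
The unbounded sequences $\{1, 2, 3, 4, 5, \ldots\}$, $\{-1, -2, -3, -4, -5, \ldots\}$ and
$\{1, -1, 2, -2, 3, -3, 4, -4, \ldots\}$ don’t have accumulation points.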


(1.17) Theorem of Bolzano-Weierstrass (Weierstrass’s lecture of 1874).
A bounded sequence $\{s_n\}$ has at least one accumulation point.


Proof. Weierstrass’s original proof used bisection, as in the proof of Theorem 1.12.
Having this theorem at our disposal, we consider the set

(1.26)    $X = \{x \mid s_n > x$ for infinitely many $n\}$

and simply put $\xi = \sup X$, which will turn out to be an accumulation point (see
Fig. 1.6). This number exists because $X$ is nonempty and majorized (the sequence
$\{s_n\}$ is bounded). By definition of the supremum, only a finite number of $s_n$ can
satisfy $s_n > \xi + \varepsilon$ and there is an infinity of terms $s_n$ that are larger than $\xi - \varepsilon$ ($\varepsilon$
is an arbitrary positive number). Hence, an infinity of terms $s_n$ lie in the interval
$[\xi - \varepsilon, \xi + \varepsilon]$.

FIGURE 1.6. Proof of the theorem of Bolzano-Weierstrass

We now choose arbitrarily an element of the sequence that lies in $[\xi - 1, \xi + 1]$
and we denote it by $s'_1 = s_{\sigma(1)}$. Then, we choose an element in $[\xi - 1/2, \xi + 1/2]$
whose index is larger than $\sigma(1)$ (this is surely possible since there must be
infinitely many) and we denote it by $s'_2 = s_{\sigma(2)}$. At the $n$th step, we choose for
$s'_n = s_{\sigma(n)}$ an element of the sequence that lies in $[\xi - 1/n, \xi + 1/n]$ and whose
index is larger than $\sigma(n - 1)$. The subsequence obtained in this way converges to
$\xi$, because $|s'_n - \xi| \le 1/n$.


Remark. This proof did not exhibit an arbitrary accumulation point but precisely
the largest accumulation point. We call it the “limit superior” of the sequence and
we denote it by

(1.27)    $\xi = \limsup_{n\to\infty} s_n = \sup\{x \in \mathbb{R} \mid s_n > x$ for infinitely many $n\}$

(see also Exercise 1.12). The smallest accumulation point is denoted by

(1.28)    $\xi = \liminf_{n\to\infty} s_n = \inf\{x \in \mathbb{R} \mid s_n < x$ for infinitely many $n\}$.


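Example. For the sequence $\bigl\{ \tfrac32, -\tfrac12, \tfrac43, -\tfrac13, \tfrac54, -\tfrac14, \tfrac65, -\tfrac15, \ldots \bigr\}$, we have

    $\limsup_{n\to\infty} s_n = 1, \quad \liminf_{n\to\infty} s_n = 0, \quad \sup\{s_n\} = 3/2, \quad \inf\{s_n\} = -1/2.$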


Exercises 


1.1 (Triangle inequality). Show, by discussing all possible combinations of signs,
that for any two real numbers $u$ and $v$ we have

(1.29)    $|u + v| \le |u| + |v|.$

Then, show that for any three real numbers $u$, $v$, and $w$ we have

(1.29')    $|u - w| \le |u - v| + |v - w|.$

1.2 Show that the sequence $\{s_n\}$ with

    $s_n = \frac{2n - 1}{n + 3}$

converges to $s = 2$. For a given $\varepsilon > 0$, say for $\varepsilon = 10^{-5}$, find a number $N$
such that $|s_n - s| < \varepsilon$ for $n \ge N$.


1.3 Show that the sequences


i 1 1 is 1 
Sn = +o t+ ee t+ See te + 
[5 867 O80. Ped (Qn — 1I)2n +3) 

se am ho Eo : 
oe ee Dae d  Sedeg Gt Lb tee) 


are Cauchy sequences and find their limits. 
Hint. Decompose the rational functions into partial fractions. 


1.4 Construct sequences $s_n$ and $v_n$ with $\lim_{n\to\infty} s_n = \infty$ and $\lim_{n\to\infty} v_n = 0$ to
illustrate each of the following possibilities.
a) $\lim_{n\to\infty} (s_n \cdot v_n) = \infty$;
b) $\lim_{n\to\infty} (s_n \cdot v_n) = c$, where $c$ is an arbitrary constant; and
c) $s_n \cdot v_n$ is bounded but not convergent.


1.5 Consider the three sequences

    $s_n = \sqrt{n + 1000} - \sqrt{n}, \qquad v_n = \sqrt{n + \sqrt{n}} - \sqrt{n}, \qquad u_n = \sqrt{n + \frac{n}{1000}} - \sqrt{n}.$

Show that $s_n > v_n > u_n$ for $n < 10^6$ and compute $\lim_{n\to\infty} s_n$, $\lim_{n\to\infty} v_n$,
$\lim_{n\to\infty} u_n$, if they exist. Arrange these limits in increasing order.


1.6 Show with the help of the estimates of Exercise I.2.5 that

    $v_n = \Bigl(1 + \frac{1}{n}\Bigr)^n$

is a Cauchy sequence. Find, for $\varepsilon = 10^{-5}$, an integer $N$ such that
$|v_n - v_{n+k}| < \varepsilon$ for $n \ge N$ and $k \ge 1$.


1.7 For two rational Cauchy sequences $\{a_n\}$ and $\{b_n\}$, we denote by $\{a_n \cdot b_n\}$
the sequence formed by the products term by term. Show

a) the sequence $\{a_n \cdot b_n\}$ is again a Cauchy sequence; and

b) if $\{a_n\} \sim \{s_n\}$ and $\{b_n\} \sim \{v_n\}$ as defined in (1.16), then $\{a_n \cdot b_n\} \sim
\{s_n \cdot v_n\}$. This shows that the product of two real numbers defined in (1.17)
is independent of the choice of the representatives.
1.8 Show the following: if $s$ is the only accumulation point of a bounded
sequence $\{s_n\}$, then the sequence is convergent and $\lim_{n\to\infty} s_n = s$. Show by
a counterexample that this property is not true for unbounded sequences.

1.9 (Cauchy 1821, p. 59; also called “Cesàro summation”). Let $\lim_{n\to\infty} a_n = a$
and

    $b_n = \frac{1}{n}\,(a_1 + a_2 + \ldots + a_n).$

Show that $\lim_{n\to\infty} b_n = a$.

1.10 Let $\alpha$ be an irrational number (for example, $\alpha = \sqrt{2}$). Consider the sequence
$\{s_n\}$ defined by

    $s_n = (n\alpha) \bmod 1,$

i.e., $s_n \in (0, 1)$ is $n\alpha$ with the integer part removed. Compute $s_1$, $s_2$, $s_3$,
$s_4, \ldots$ and sketch these values. Show that every point in $[0, 1]$ is an
accumulation point of this sequence.
Hint. For $\varepsilon > 0$ and $n > 1/\varepsilon$ at least two points among $s_1, s_2, \ldots, s_{n+1}$ (call
them $s_k$ and $s_{k+\ell}$) are closer than $\varepsilon$. Then, the points $s_k, s_{k+\ell}, s_{k+2\ell}, \ldots$
form a grid with mesh size $< \varepsilon$.
Remark. At the beginning of the computer era, this procedure was the
standard method for creating pseudo random numbers.

1.11 Let $\{s_n\}$ and $\{v_n\}$ be two bounded sequences. Show that

    $\limsup_{n\to\infty} (s_n + v_n) \le \limsup_{n\to\infty} s_n + \limsup_{n\to\infty} v_n,$

    $\liminf_{n\to\infty} (s_n + v_n) \ge \liminf_{n\to\infty} s_n + \liminf_{n\to\infty} v_n.$

Show with the help of examples that the inequality can be strict.

1.12 Prove that for a sequence $\{s_n\}$ we have

    $\limsup_{n\to\infty} s_n = \lim_{n\to\infty} v_n$,  where  $v_n = \sup\{s_n, s_{n+1}, s_{n+2}, \ldots\}$.

1.13 Compute all accumulation points of the sequence

    $\{s_n\} = \{p_{11}, p_{21}, p_{22}, p_{31}, p_{32}, p_{33}, p_{41}, p_{42}, \ldots\}, \qquad p_{k\ell} = \sum_{i=\ell}^{k} \frac{1}{i^2}.$

Show that (see Eq. (I.5.23)) $\limsup s_n = \pi^2/6$ and that $\liminf s_n = 0$ (see
Fig. 1.7).

FIGURE 1.7. Sequence with a countable number of accumulation points
I shall devote all my efforts to bring light into the immense obscurity that
today reigns in Analysis. It so lacks any plan or system, that one is really 
astonished that so many people devote themselves to it — and, still worse, 
it is absolutely devoid of any rigour. 

(Abel 1826, Oeuvres, vol. 2, p. 263) 


Cauchy is mad, and there is no way of being on good terms with him, 
although at present he is the only man who knows how mathematics should 
be treated. What he does is excellent, but very confused .. . 

(Abel 1826, Oeuvres, vol. 2, p. 259) 


Since Newton and Leibniz, infinite series

(2.1)    $a_0 + a_1 + a_2 + a_3 + \ldots$

have been the universal tool for all calculations (see Chap. I). We will make precise
here what (2.1) really represents. The idea is to consider the sequence $\{s_n\}$ of
partial sums

(2.2)    $s_0 = a_0, \quad s_1 = a_0 + a_1, \quad \ldots, \quad s_n = \sum_{i=0}^{n} a_i,$

and to apply the definitions and results of the preceding section. A classical
reference for infinite series is the book of Knopp (1922).


(2.1) Definition. We say that the infinite series (2.1) converges, if the sequence
$\{s_n\}$ of (2.2) converges. We write

    $\sum_{i=0}^{\infty} a_i = \lim_{n\to\infty} s_n.$

FIGURE 2.1. “Geometric” view of the geometric series


(2.2) Example. Consider the geometric series whose $n$th partial sum is given by
$s_n = 1 + q + q^2 + \ldots + q^n$ (see Fig. 2.1). Multiplying this expression by $1 - q$,
most terms cancel, and we get (for $q \ne 1$)

    $s_n = 1 + q + q^2 + \ldots + q^n = \frac{1 - q^{n+1}}{1 - q}.$

From Lemma 1.4, together with Theorem 1.5, we thus have

    $1 + q + q^2 + q^3 + \ldots \ \begin{cases} = \dfrac{1}{1 - q} & \text{if } |q| < 1, \\ \text{diverges to } \infty & \text{if } q \ge 1, \\ \text{diverges} & \text{if } q \le -1. \end{cases}$
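
As a quick check of the closed form, the partial sums can be computed exactly and compared with $(1 - q^{n+1})/(1 - q)$. The sketch below is illustrative only; the function name `geometric_partial_sum` and the choice $q = 1/2$, $n = 10$ are assumptions made for the example.

```python
from fractions import Fraction

def geometric_partial_sum(q, n):
    """s_n = 1 + q + q^2 + ... + q^n, summed term by term."""
    total, power = Fraction(0), Fraction(1)
    for _ in range(n + 1):
        total += power
        power *= q
    return total

q = Fraction(1, 2)
s10 = geometric_partial_sum(q, 10)
closed_form = (1 - q ** 11) / (1 - q)   # (1 - q^(n+1)) / (1 - q) with n = 10
gap_to_limit = 1 / (1 - q) - s10        # equals q^(n+1) / (1 - q)
```

For $|q| < 1$ the gap $q^{n+1}/(1 - q)$ tends to zero, which is the convergence statement of the example.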


Criteria for Convergence 


Usually it is not possible to find a simple expression for $s_n$ and it is difficult to
compute explicitly the limit of $\{s_n\}$. In this case, it is natural to apply Cauchy’s
criterion of Theorem 1.8 to the sequence of partial sums. Since $s_{n+k} - s_n =
a_{n+1} + a_{n+2} + \ldots + a_{n+k}$, we get


(2.3) Lemma. The infinite series (2.1) converges to a real number if and only if

    $\forall \varepsilon > 0 \ \ \exists N \ge 0 \ \ \forall n \ge N \ \ \forall k \ge 1 \quad |a_{n+1} + a_{n+2} + \ldots + a_{n+k}| < \varepsilon.$


Putting $k = 1$ in this criterion, we see that

(2.3)    $\lim_{i\to\infty} a_i = 0$

is a necessary condition for the convergence of (2.1). However, (2.3) is not
sufficient for the convergence of (2.1). This can be seen with the counterexample

    $1 + \frac12 + \frac12 + \frac13 + \frac13 + \frac13 + \frac14 + \frac14 + \frac14 + \frac14 + \frac15 + \ldots$

In what follows, we shall discuss some sufficient conditions for the convergence
of (2.1).
Leibniz’s Criterion. Consider an infinite series where the terms have alternating
signs

(2.4)    $a_0 - a_1 + a_2 - a_3 + a_4 - \ldots = \sum_{i \ge 0} (-1)^i a_i.$


(2.4) Theorem (Leibniz 1682). Suppose that the terms $a_i$ of the alternating series
(2.4) satisfy for all $i$

    $a_i > 0, \qquad a_{i+1} \le a_i, \qquad \lim_{i\to\infty} a_i = 0;$

then, the series (2.4) converges to a real value $s$ and we have the estimate

(2.5)    $|s - s_n| \le a_{n+1},$

i.e., the error of the $n$th partial sum is not larger than the first neglected term.
FIGURE 2.2. Proof of Leibniz’s criterion


Proof. Denote by $s_n$ the $n$th partial sum of (2.4). It then follows from the
monotonicity assumption that $s_{2k+1} = s_{2k-1} + a_{2k} - a_{2k+1} \ge s_{2k-1}$ and that
$s_{2k+2} = s_{2k} - a_{2k+1} + a_{2k+2} \le s_{2k}$. From the positivity of $a_{2k+1}$, we have
$s_{2k+1} < s_{2k}$ so that, by combining these inequalities,

    $s_1 \le s_3 \le s_5 \le s_7 \le \ldots \le s_6 \le s_4 \le s_2 \le s_0$

(see Fig. 2.2). Consequently, $s_{n+k}$ lies for all $k$ between $s_n$ and $s_{n+1}$, and we have

(2.6)    $|s_{n+k} - s_n| \le |s_{n+1} - s_n| = a_{n+1}.$

This implies the convergence of $\{s_n\}$ by Theorem 1.8, since $a_{n+1}$ tends to 0 for
$n \to \infty$. Finally, the estimate (2.5) is obtained by considering the limit $k \to \infty$ in
(2.6) (use Theorem 1.6).
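
The error estimate (2.5) can be observed numerically. The following sketch is illustrative only: it uses the series $1 - \frac12 + \frac13 - \ldots$ with $a_i = 1/(i+1)$ and takes for granted (as an assumption not proved at this point of the text) that its sum is $\ln 2$. The helper name `alternating_partial_sum` is hypothetical, and floating-point arithmetic is used, so the check is approximate.

```python
import math

def alternating_partial_sum(a, n):
    """s_n = a(0) - a(1) + a(2) - ... + (-1)^n a(n)."""
    return sum((-1) ** i * a(i) for i in range(n + 1))

def a(i):
    """Terms a_i = 1/(i+1): positive, decreasing, tending to 0."""
    return 1.0 / (i + 1)

# Check |s - s_n| <= a_(n+1) for the first 200 partial sums, with s = ln 2.
checks = [abs(math.log(2) - alternating_partial_sum(a, n)) <= a(n + 1)
          for n in range(200)]
```

Every entry of `checks` is expected to be true; in fact the observed error is roughly half of the first neglected term, so the estimate (2.5) is not wasteful for this series.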


Examples. The convergence of (see (I.4.29) and (I.3.13a))

    $1 - \frac13 + \frac15 - \frac17 + \ldots \qquad \text{and} \qquad 1 - \frac12 + \frac13 - \frac14 + \ldots$

is thus established. However, we have not yet rigorously proved that the first sum
represents $\pi/4$ and the second one $\ln 2$ (see Example 7.11 below).
If a continued fraction (I.6.7) is converted into an infinite series, we obtain
(see Eq. (I.6.16))

    $q_0 + \frac{p_1}{B_1} - \frac{p_1 p_2}{B_1 B_2} + \frac{p_1 p_2 p_3}{B_2 B_3} - \frac{p_1 p_2 p_3 p_4}{B_3 B_4} + \ldots$

Assuming that the integers $p_i$ and $q_i$ are positive, this is an alternating series (from
the second term onward). Furthermore, we have $B_k = q_k B_{k-1} + p_k B_{k-2} >
p_k B_{k-2}$, implying that the terms of the series are monotonically decreasing. Under
the additional assumption that $0 < p_i \le q_i$ for all $i \ge 1$ (see Theorem I.6.4), we
have

    $B_k B_{k-1} = q_k B_{k-1}^2 + p_k B_{k-1} B_{k-2} \ge 2 p_k B_{k-1} B_{k-2}$

and consequently also $B_k B_{k-1} \ge 2^{k-1} p_1 \cdot \ldots \cdot p_k$. This proves that the terms
of the series tend to zero and, by Theorem 2.4, that the series under consideration
converges.
Majorizing or Minorizing a Series. For infinite series with non-negative terms
the following criterion is extremely useful.


(2.5) Theorem. Suppose that $0 \le a_i \le b_i$ for all (sufficiently large) $i$. Then

    $\sum_{i=0}^{\infty} b_i$ converges  $\Rightarrow$  $\sum_{i=0}^{\infty} a_i$ converges,
    $\sum_{i=0}^{\infty} a_i$ diverges  $\Rightarrow$  $\sum_{i=0}^{\infty} b_i$ diverges.

Proof. Putting $s_n = \sum_{i=0}^{n} a_i$ and $v_n = \sum_{i=0}^{n} b_i$, this result is an immediate
consequence of Corollary 1.14.


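As a first application, we give an easy proof of the divergence of the harmonic
series $\sum_{i \ge 1} \frac{1}{i}$ (N. Oresme, around 1350; see Struik 1969, p. 320). We minorize
this series as follows:

    $\sum b_i = 1 + \frac12 + \frac13 + \frac14 + \frac15 + \frac16 + \frac17 + \frac18 + \frac19 + \frac1{10} + \ldots$

    $\sum a_i = 1 + \frac12 + \underbrace{\frac14 + \frac14}_{1/2} + \underbrace{\frac18 + \frac18 + \frac18 + \frac18}_{1/2} + \underbrace{\frac1{16} + \frac1{16} + \ldots}_{1/2} + \ldots$

Since $\sum a_i$ diverges, it follows from $0 \le a_i \le b_i$ that the harmonic series $\sum b_i$
diverges too.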
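
Oresme’s grouping can be verified with exact arithmetic. This is a minimal sketch under stated assumptions: the function names `harmonic` and `oresme_minorant` are hypothetical, and the minorant replaces each $1/i$ by $1/2^m$, where $2^m$ is the smallest power of two with $2^m \ge i$, which is one way of writing the series $\sum a_i$ above.

```python
from fractions import Fraction

def harmonic(n):
    """Partial sum 1 + 1/2 + ... + 1/n of the harmonic series (the b_i)."""
    return sum(Fraction(1, i) for i in range(1, n + 1))

def oresme_minorant(n):
    """Partial sum of the minorizing series (the a_i): each 1/i is replaced
    by 1/p, with p the smallest power of two satisfying p >= i."""
    total = Fraction(0)
    for i in range(1, n + 1):
        p = 1
        while p < i:
            p *= 2
        total += Fraction(1, p)
    return total

# For n = 2^k the minorant equals 1 + k/2, which is unbounded in k.
table = [(k, oresme_minorant(2 ** k), harmonic(2 ** k)) for k in range(9)]
```

Each row of `table` shows the minorant $1 + k/2$ sitting below the harmonic partial sum, so the partial sums of $\sum 1/i$ exceed any given bound.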
As a further example, we consider the series (I.2.18) for $e^x$ (e.g., for $x = 10$),

(2.7)    $1 + 10 + \frac{10^2}{2!} + \frac{10^3}{3!} + \frac{10^4}{4!} + \ldots$

We omit the first 10 terms (this does not influence the convergence), and compare
the resulting series with the geometric series (Example 2.2 with $q = 10/11 < 1$)

    $\frac{10^{10}}{10!} + \frac{10^{11}}{11!} + \frac{10^{12}}{12!} + \ldots = \frac{10^{10}}{10!} \Bigl( 1 + \frac{10}{11} + \frac{10^2}{11 \cdot 12} + \frac{10^3}{11 \cdot 12 \cdot 13} + \ldots \Bigr)$
    $\le \frac{10^{10}}{10!} \Bigl( 1 + \frac{10}{11} + \frac{10^2}{11^2} + \frac{10^3}{11^3} + \ldots \Bigr).$

The convergence of the geometric series implies the convergence of (2.7).
Similarly, one can prove that the series (I.2.18) converges for all $x$. This comparison
with the geometric series will be used on several occasions (see Criteria 2.10 and
2.11, Lemma 7.1, and Theorems 7.5 and 7.7).


(2.6) Lemma. The series

(2.8)    $\sum_{n \ge 1} \frac{1}{n^\alpha} = 1 + \frac{1}{2^\alpha} + \frac{1}{3^\alpha} + \frac{1}{4^\alpha} + \frac{1}{5^\alpha} + \ldots$

converges for all $\alpha > 1$. It diverges for $\alpha \le 1$.


Proof. The divergence of the series for $\alpha = 1$ (harmonic series) has been
established above. For $\alpha < 1$ the individual terms become still larger, so that the series
diverges by Theorem 2.5.
We shall next prove the convergence of (2.8) for $\alpha = (k+1)/k$, where $k \ge 1$
is an integer. The idea is to consider the series

    $1 - \frac{1}{\sqrt[k]{2}} + \frac{1}{\sqrt[k]{3}} - \frac{1}{\sqrt[k]{4}} + \ldots$

which converges by Leibniz’s criterion. The sum of two successive terms can be
minorized as follows:

(2.9)    $\frac{1}{\sqrt[k]{2i-1}} - \frac{1}{\sqrt[k]{2i}} = \frac{\sqrt[k]{2i} - \sqrt[k]{2i-1}}{\sqrt[k]{2i}\,\sqrt[k]{2i-1}} \ge \frac{C_k}{i^{(k+1)/k}},$

where $C_k = 1/(k \cdot 2^{(k+1)/k})$ is a constant independent of $i$. The last inequality in
(2.9) is obtained from the identity $a^k - b^k = (a - b)(a^{k-1} + a^{k-2} b + a^{k-3} b^2 +
\ldots + b^{k-1})$ with $a = \sqrt[k]{2i}$ and $b = \sqrt[k]{2i-1}$ as follows:

    $\sqrt[k]{2i} - \sqrt[k]{2i-1} = \frac{1}{(2i)^{(k-1)/k} + \ldots + (2i-1)^{(k-1)/k}} \ge \frac{1}{k\,(2i)^{(k-1)/k}}.$

Since the series of these grouped pairs converges and, by (2.9), majorizes
$C_k \sum_i i^{-(k+1)/k}$, Theorem 2.5 shows that the series (2.8) converges for $\alpha = (k + 1)/k$.
Finally, for an arbitrary $\alpha > 1$ there exists an integer $k$ with $\alpha > (k + 1)/k$.
Theorem 2.5 applied once more then shows convergence for all $\alpha > 1$.


Absolute Convergence 


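Example. The series

(2.10)    $1 - \frac12 + \frac13 - \frac14 + \frac15 - \frac16 + \ldots$

is convergent by Leibniz’s criterion (actually to $\ln 2$). If we rearrange the series as
follows:

    $\underbrace{1 - \frac12}_{1/2} - \frac14 + \underbrace{\frac13 - \frac16}_{1/6} - \frac18 + \underbrace{\frac15 - \frac1{10}}_{1/10} - \frac1{12} + \underbrace{\frac17 - \frac1{14}}_{1/14} - \frac1{16} + \ldots,$

we obtain

    $\frac12 - \frac14 + \frac16 - \frac18 + \frac1{10} - \ldots = \frac12 \Bigl( 1 - \frac12 + \frac13 - \frac14 + \frac15 - \ldots \Bigr),$

which is now half as much as originally. This shows that the value of an infinite
sum can depend on the order of summation.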


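(2.7) Definition. A series $\sum_{i=0}^{\infty} a'_i$ is a rearrangement of $\sum_{i=0}^{\infty} a_i$, if every term
of $\sum_{i=0}^{\infty} a_i$ appears in $\sum_{i=0}^{\infty} a'_i$ exactly once and conversely (this means that
there exists a bijective mapping $\sigma : \mathbb{N}_0 \to \mathbb{N}_0$ such that $a'_i = a_{\sigma(i)}$; here
$\mathbb{N}_0 = \{0, 1, 2, 3, 4, \ldots\}$).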
Explanation. An elegant explanation for the above phenomenon was given by
Riemann (1854, Werke, p. 235, “... ein Umstand, welcher von den Mathematikern
des vorigen Jahrhunderts übersehen wurde ...”). In fact, Riemann observed much
more: for any given real number $A$ it is possible to rearrange the terms of (2.10)
in such a way that the resulting series converges to $A$. The reason is that the sum
of the positive terms of (2.10) and the sum of the negative terms,

    $1 + \frac13 + \frac15 + \frac17 + \frac19 + \ldots \qquad \text{and} \qquad -\frac12 - \frac14 - \frac16 - \frac18 - \frac1{10} - \ldots,$

are both divergent (or equivalently: the series (2.10) with each term replaced by
its absolute value diverges).

The idea is to take first the positive terms $1 + 1/3 + \ldots$ until the sum exceeds
$A$ (this certainly happens because the series with positive terms diverges). Then,
we take the negative terms until we are below $A$ (this certainly happens because
$-1/2 - 1/4 - \ldots$ diverges). Then, we go on adding positive terms until $A$ is again
exceeded, and so on. In this way, we obtain a rearranged series that converges to
$A$ (cf. examples in Fig. 2.3).
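
Riemann’s procedure is easy to simulate. The sketch below is illustrative only: the function name `rearranged_partial_sums`, the target $A = 1.5$, and the number of terms are assumptions made for the example, and floating-point arithmetic is used, so the result is approximate.

```python
def rearranged_partial_sums(target, n_terms):
    """Greedy rearrangement of 1 - 1/2 + 1/3 - ...: add the next unused
    positive term while the sum is at or below the target, otherwise add
    the next unused negative term."""
    pos, neg = 1, 2                 # next odd and next even denominator
    total, sums = 0.0, []
    for _ in range(n_terms):
        if total <= target:
            total += 1.0 / pos      # positive terms 1, 1/3, 1/5, ...
            pos += 2
        else:
            total -= 1.0 / neg      # negative terms -1/2, -1/4, -1/6, ...
            neg += 2
        sums.append(total)
    return sums

sums = rearranged_partial_sums(1.5, 100000)
```

Every term of (2.10) is used exactly once in the long run, and after each crossing the partial sum stays within the size of the last term used, so `sums` approaches the chosen target instead of $\ln 2$.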


FIGURE 2.3. Rearrangements of the series (2.10)


(2.8) Definition. The series (2.1) is absolutely convergent if

    $|a_0| + |a_1| + |a_2| + |a_3| + \ldots$

converges.


(2.9) Theorem (Dirichlet 1837b). If the series $\sum_{i=0}^{\infty} a_i$ is absolutely convergent,
then all its rearrangements converge to the same limit.


Proof. By Cauchy’s criterion, absolute convergence means that

    $\forall \varepsilon > 0 \ \ \exists N \ge 0 \ \ \forall n \ge N \ \ \forall k \ge 1 \quad |a_{n+1}| + |a_{n+2}| + \ldots + |a_{n+k}| < \varepsilon.$

For a given $\varepsilon > 0$ and the corresponding $N \ge 0$ we choose an integer $M$ in
such a way that all terms $a_0, a_1, \ldots, a_N$ appear in the $M$th partial sum $s'_M =
\sum_{i=0}^{M} a'_i$ of the rearrangement. Therefore, in the difference $s_m - s'_m$, all the terms
$a_0, a_1, \ldots, a_N$ disappear (for $m \ge M$) and we have

    $|s_m - s'_m| \le |a_{N+1}| + |a_{N+2}| + \ldots + |a_{N+k}| < \varepsilon,$

where $k$ is a sufficiently large integer. This shows that $s_m - s'_m \to 0$ and that the
rearrangement converges to the same limit as the original series.


We next present two criteria for the absolute convergence of an infinite
series.


(2.10) The Ratio Test (Cauchy 1821). If the terms $a_n$ of the series (2.1) satisfy

(2.11)    $\limsup_{n\to\infty} \frac{|a_{n+1}|}{|a_n|} < 1,$

then the series is absolutely convergent. If $\liminf_{n\to\infty} |a_{n+1}|/|a_n| > 1$, then it
diverges.


Proof. Choose a number $q$ that satisfies $\limsup_{n\to\infty} |a_{n+1}|/|a_n| < q < 1$. Then,
only a finite number of quotients $|a_{n+1}|/|a_n|$ are larger than $q$ and we have

    $\exists N \ge 0 \ \ \forall n \ge N \quad \frac{|a_{n+1}|}{|a_n|} \le q.$

This, in turn, implies $|a_{N+1}| \le q|a_N|$, $|a_{N+2}| \le q^2|a_N|$, $|a_{N+3}| \le q^3|a_N|$, etc.
Since the geometric series converges (we have $0 < q < 1$), the series $\sum_{i \ge 0} |a_i|$
also converges.

If $\liminf_{n\to\infty} |a_{n+1}|/|a_n| > 1$, then the sequence $\{|a_n|\}$ is monotonically
increasing for $n \ge N$ and the necessary condition (2.3) is not satisfied.


Examples. The general term of the series for $e^x$ is $a_n = x^n/n!$. Here, we have
$|a_{n+1}|/|a_n| = |x|/(n + 1) \to 0$ so that the series (I.2.18) converges absolutely
for all real $x$. Similarly, the series for $\sin x$ and $\cos x$ converge absolutely for all $x$.

For the series (2.8) this criterion cannot be applied because $|a_{n+1}|/|a_n| =
(n/(n + 1))^\alpha \to 1$.


(2.11) The Root Test (Cauchy 1821). If

(2.12)    $\limsup_{n\to\infty} \sqrt[n]{|a_n|} < 1,$

then the series (2.1) is absolutely convergent. If $\limsup_{n\to\infty} \sqrt[n]{|a_n|} > 1$, then it
diverges.


Proof. As in the proof of the ratio test, we choose a number $q < 1$ that is strictly
larger than $\limsup_{n\to\infty} \sqrt[n]{|a_n|}$. Hence,

    $\exists N \ge 0 \ \ \forall n \ge N \quad \sqrt[n]{|a_n|} \le q.$

This implies $|a_n| \le q^n$ for $n \ge N$, and a comparison with the geometric series
yields the absolute convergence of $\sum_{i=0}^{\infty} a_i$. If $\limsup_{n\to\infty} \sqrt[n]{|a_n|} > 1$, then
$|a_n| > 1$ for infinitely many $n$, so that the condition (2.3) is not satisfied and the
series cannot converge.


Double Series 


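Consider a two-dimensional array of real numbers

(2.13)
    $\begin{array}{ccccccccccc}
    a_{00} & + & a_{01} & + & a_{02} & + & a_{03} & + & \ldots & = & s_0 \\
    +      &   & +      &   & +      &   & +      &   &        &   & +   \\
    a_{10} & + & a_{11} & + & a_{12} & + & a_{13} & + & \ldots & = & s_1 \\
    +      &   & +      &   & +      &   & +      &   &        &   & +   \\
    a_{20} & + & a_{21} & + & a_{22} & + & a_{23} & + & \ldots & = & s_2 \\
    +      &   & +      &   & +      &   & +      &   &        &   & +   \\
    a_{30} & + & a_{31} & + & a_{32} & + & a_{33} & + & \ldots & = & s_3 \\
    +      &   & +      &   & +      &   & +      &   &        &   & +   \\
    \vdots &   & \vdots &   & \vdots &   & \vdots &   &        &   & \vdots \\
    v_0    & + & v_1    & + & v_2    & + & v_3    & + & \ldots & = & \ ???
    \end{array}$

and suppose we want to sum up all of them. There are many natural ways of doing
this. One can either add up the elements of the $i$th row, denote the result by $s_i$,
and then compute $\sum_{i=0}^{\infty} s_i$; or one can add up the elements of the $j$th column,
denote the result by $v_j$, and then compute $\sum_{j=0}^{\infty} v_j$. It is also possible to write all
elements in a linear arrangement. For example, we can start with $a_{00}$, then add the
elements $a_{ij}$ for which $i + j = 1$, then those with $i + j = 2$, and so on. This gives

(2.14)    $a_{00} + (a_{10} + a_{01}) + (a_{20} + a_{11} + a_{02}) + (a_{30} + \ldots) + \ldots$

Here, we denote the pairs $(0,0), (1,0), (0,1), (2,0), \ldots$ by $\sigma(0), \sigma(1), \sigma(2),
\sigma(3), \ldots$, so that $\sigma$ is a map $\sigma : \mathbb{N}_0 \to \mathbb{N}_0 \times \mathbb{N}_0$, where $\mathbb{N}_0 \times \mathbb{N}_0 = \{(i, j) \mid i \in
\mathbb{N}_0,\ j \in \mathbb{N}_0\}$ is the so-called Cartesian product of $\mathbb{N}_0$ with $\mathbb{N}_0$. So, we define in
general,


(2.12) Definition. A series $\sum_{k=0}^{\infty} b_k$ is called a linear arrangement of the double
series (2.13) if there exists a bijective mapping $\sigma : \mathbb{N}_0 \to \mathbb{N}_0 \times \mathbb{N}_0$ such that
$b_k = a_{\sigma(k)}$.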
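
The diagonal enumeration behind (2.14) can be written out explicitly. The following is a minimal sketch under stated assumptions: the function name `diagonal_pairs` is hypothetical, and the test array $a_{ij} = 2^{-i} 3^{-j}$ is chosen purely for illustration because its rows and columns are geometric series with total sum $2 \cdot \frac32 = 3$.

```python
from fractions import Fraction

def diagonal_pairs(count):
    """First `count` values sigma(0), sigma(1), ... of the diagonal enumeration."""
    pairs, d = [], 0
    while len(pairs) < count:
        # all (i, j) with i + j = d, in the order (d, 0), (d-1, 1), ..., (0, d)
        pairs.extend((d - j, j) for j in range(d + 1))
        d += 1
    return pairs[:count]

def a(i, j):
    """Hypothetical test array a_ij = 2^-i * 3^-j (all terms positive)."""
    return Fraction(1, 2 ** i * 3 ** j)

pairs = diagonal_pairs(210)                    # diagonals i + j = 0, ..., 19
linear_sum = sum(a(i, j) for i, j in pairs)    # partial sum of b_k = a_sigma(k)
row_limit = Fraction(2) * Fraction(3, 2)       # (sum of 2^-i) * (sum of 3^-j) = 3
```

For this particular array the partial sums of the linear arrangement approach the value obtained by summing rows first; whether this holds in general is exactly the question taken up next.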


The question now is: do the different possibilities of summation lead to the
same value? Do we have

(2.15)    $s_0 + s_1 + \ldots = \sum_{i=0}^{\infty} \Bigl( \sum_{j=0}^{\infty} a_{ij} \Bigr) = \sum_{j=0}^{\infty} \Bigl( \sum_{i=0}^{\infty} a_{ij} \Bigr) = v_0 + v_1 + \ldots,$

and do linear arrangements converge to the same value?
The counterexample of Fig. 2.4a shows that this is not true without some
additional assumptions.
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Ce ee el 
Gone is mi 
a ae aa ; Gal ie ia 
+ ae ee ee: ate + a ee 
el nil 

12 oe 0202 =170 7 + 
bull 

Co YT [eneme 

FIGURE 2.4a. Counterexample FIGURE2.4b. Double series 
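
To see concretely how the two iterated sums can differ, here is a small Python sketch. The array used is a standard illustration chosen by us (entries $+1$ on the diagonal and $-1$ immediately to the right of it); it is an assumption for the sake of the example and not necessarily the array drawn in Fig. 2.4a.

```python
def a(i, j):
    """Hypothetical array: +1 on the diagonal, -1 just to the right of it, 0 elsewhere."""
    if j == i:
        return 1
    if j == i + 1:
        return -1
    return 0

def row_sum(i):
    # Row i has nonzero entries only in columns i and i + 1, so this finite sum is exact.
    return sum(a(i, j) for j in range(i + 2))

def col_sum(j):
    # Column j has nonzero entries only in rows j - 1 and j, so this finite sum is exact.
    return sum(a(i, j) for i in range(j + 1))

M = 50
sum_of_rows = sum(row_sum(i) for i in range(M))   # s_0 + ... + s_{M-1}
sum_of_cols = sum(col_sum(j) for j in range(M))   # v_0 + ... + v_{M-1}
abs_square = sum(abs(a(i, j)) for i in range(M) for j in range(M))
```

Every row sums to $0$, whereas the columns sum to $1, 0, 0, \ldots$, so the two iterated sums are $0$ and $1$. The quantity `abs_square` equals $2M - 1$ and grows without bound, which is exactly the kind of behaviour that a bound on the sums of absolute values rules out.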


(2.13) Theorem (Cauchy 1821, “Note VII”). Suppose for the double series (2.13)
that

(2.16)    $\exists B \ge 0 \ \ \forall m \ge 0 \quad \sum_{i=0}^{m} \sum_{j=0}^{m} |a_{ij}| \le B.$

Then, all the series in (2.15) are convergent and the identities of (2.15) are satisfied.
Furthermore, every linear arrangement of the double series converges to the
same value.


Proof. Let $b_0 + b_1 + b_2 + \ldots$ be a linear arrangement of the double series (2.13). The
sequence $\{\sum_{i=0}^{n} |b_i|\}$ is monotonically increasing and bounded (by assumption
(2.16)) so that $\sum_{i=0}^{\infty} |b_i|$, and hence also $\sum_{i=0}^{\infty} b_i$, converge. Analogously, we can
establish the convergence of $s_i = \sum_{j=0}^{\infty} a_{ij}$ and $v_j = \sum_{i=0}^{\infty} a_{ij}$.

Inspired by the proof of Theorem 2.9, we apply Cauchy’s criterion to the
series $\sum_{i=0}^{\infty} |b_i|$ and have

    $\forall \varepsilon > 0 \ \ \exists N \ge 0 \ \ \forall n \ge N \ \ \forall k \ge 1 \quad |b_{n+1}| + |b_{n+2}| + \ldots + |b_{n+k}| < \varepsilon.$

For a given $\varepsilon > 0$ and the corresponding $N \ge 0$ we choose an integer $M$ in
such a way that all elements $b_0, b_1, \ldots, b_N$ are present in the box $0 \le i \le M$,
$0 \le j \le M$ (see Fig. 2.4b). With this choice, $b_0, b_1, \ldots, b_N$ appear in the sum
$\sum_{i=0}^{l} b_i$ (for $l \ge N$) as well as in $\sum_{i=0}^{m} \sum_{j=0}^{n} a_{ij}$ (for $m \ge M$ and $n \ge M$).
Hence, we have for $l \ge N$, $m \ge M$, $n \ge M$,

(2.17)    $\Bigl| \sum_{i=0}^{m} \sum_{j=0}^{n} a_{ij} - \sum_{i=0}^{l} b_i \Bigr| \le |b_{N+1}| + \ldots + |b_{N+k}| < \varepsilon,$

with a sufficiently large $k$. We set $s = \sum_{i=0}^{\infty} b_i$ and take the limits $l \to \infty$ and
$n \to \infty$ in (2.17). Then, we exchange the finite summations
$\sum_{i=0}^{m} \sum_{j=0}^{n} \leftrightarrow \sum_{j=0}^{n} \sum_{i=0}^{m}$ and take the limits $l \to \infty$ and $m \to \infty$. This yields, by Theorem 1.6,

    $\Bigl| \sum_{i=0}^{m} s_i - s \Bigr| \le \varepsilon$  and  $\Bigl| \sum_{j=0}^{n} v_j - s \Bigr| \le \varepsilon.$

Hence $\sum_{i=0}^{\infty} s_i$ and $\sum_{j=0}^{\infty} v_j$ both converge to the same limit $s$.


The Cauchy Product of Two Series 


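If we want to compute the product of two infinite series $\sum_{i=0}^{\infty} a_i$ and $\sum_{j=0}^{\infty} b_j$,
we have to add all elements of the two-dimensional array

(2.18)
$$
\begin{array}{ccccc}
a_0 b_0 & a_0 b_1 & a_0 b_2 & a_0 b_3 & \cdots \\
a_1 b_0 & a_1 b_1 & a_1 b_2 & a_1 b_3 & \cdots \\
a_2 b_0 & a_2 b_1 & a_2 b_2 & a_2 b_3 & \cdots \\
a_3 b_0 & a_3 b_1 & a_3 b_2 & a_3 b_3 & \cdots \\
\vdots & \vdots & \vdots & \vdots &
\end{array}
$$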


If we arrange the elements as indicated in Eq. (2.14), we obtain the so-called 
Cauchy product of the two series. 


(2.14) Definition. The Cauchy product of the series $\sum_{i=0}^{\infty} a_i$ and $\sum_{j=0}^{\infty} b_j$ is defined by

    $\sum_{n=0}^{\infty} \Bigl( \sum_{j=0}^{n} a_{n-j} \cdot b_j \Bigr) = a_0 b_0 + (a_0 b_1 + a_1 b_0) + (a_0 b_2 + a_1 b_1 + a_2 b_0) + \ldots\,.$


The question is whether the Cauchy product is a convergent series and
whether it really represents the product of the two series $\sum_{i \ge 0} a_i$ and $\sum_{j \ge 0} b_j$.


(2.15) Counterexample (Cauchy 1821). The series

    $1 - \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} - \frac{1}{\sqrt{4}} + \frac{1}{\sqrt{5}} - \ldots$

converges by Leibniz’s criterion. We consider the Cauchy product of this series
with itself. Since

    $\Bigl| \sum_{j=0}^{n} a_{n-j} \cdot a_j \Bigr| = \sum_{j=0}^{n} \frac{1}{\sqrt{(n+1-j)(j+1)}} \ge \frac{2n+2}{n+2}$

(the inequality is a consequence of $(n+1-x)(x+1) \le (1+n/2)^2$ for $0 \le x \le n$),
the necessary condition (2.3) for the convergence of the Cauchy product is not
satisfied (see Fig. 2.5). This example illustrates the fact that the Cauchy product
of two convergent series need not converge.
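
The estimate of Counterexample 2.15 can be checked numerically. The sketch below is only an illustration in floating-point arithmetic; the function names are ours, and the terms are indexed from $n = 0$ so that $a_n = (-1)^n/\sqrt{n+1}$.

```python
import math

def a(n):
    """Term a_n = (-1)^n / sqrt(n + 1) of the alternating series, n = 0, 1, 2, ..."""
    return (-1) ** n / math.sqrt(n + 1)

def cauchy_term(n):
    """n-th term c_n = sum_{j=0}^{n} a_{n-j} * a_j of the Cauchy product of the series with itself."""
    return sum(a(n - j) * a(j) for j in range(n + 1))

def lower_bound(n):
    """The bound (2n + 2) / (n + 2) from Counterexample 2.15."""
    return (2 * n + 2) / (n + 2)

terms = [cauchy_term(n) for n in range(200)]
violations = [n for n in range(200) if abs(terms[n]) < lower_bound(n) - 1e-12]
```

The list `violations` stays empty, and since the bound tends to $2$, the terms of the Cauchy product do not tend to zero.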


FIGURE 2.5. Divergence of the Cauchy product of Counterexample 2.15


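(2.16) Theorem (Cauchy 1821). If the two series $\sum_{i=0}^{\infty} a_i$ and $\sum_{j=0}^{\infty} b_j$ are absolutely
convergent, then their Cauchy product converges and we have

(2.19)    $\Bigl( \sum_{i=0}^{\infty} a_i \Bigr) \Bigl( \sum_{j=0}^{\infty} b_j \Bigr) = \sum_{n=0}^{\infty} \Bigl( \sum_{j=0}^{n} a_{n-j} \cdot b_j \Bigr).$


Proof. By hypothesis, we have $\sum_{i=0}^{\infty} |a_i| \le B_1$ and $\sum_{j=0}^{\infty} |b_j| \le B_2$. Therefore,
we have for the two-dimensional array (2.18) that for all $m \ge 0$

    $\sum_{i=0}^{m} \sum_{j=0}^{m} |a_i| |b_j| \le B_1 B_2,$

and Theorem 2.13 can be applied. The sum of the $i$th row gives $s_i = a_i \cdot \sum_{j=0}^{\infty} b_j$
and $\sum_{i=0}^{\infty} s_i = \bigl( \sum_{i=0}^{\infty} a_i \bigr) \bigl( \sum_{j=0}^{\infty} b_j \bigr)$. By Theorem 2.13, the Cauchy product,
which is a linear arrangement of (2.18) with the terms of each diagonal grouped together, also converges to this value.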


Examples. For $|q| < 1$ consider the two series

    $1 + q + q^2 + q^3 + \ldots = \frac{1}{1-q}$   and   $1 - q + q^2 - q^3 + \ldots = \frac{1}{1+q}.$

Their Cauchy product is

    $1 + q^2 + q^4 + q^6 + \ldots = \frac{1}{1-q^2},$

which, indeed, is the product of $(1-q)^{-1}$ and $(1+q)^{-1}$.
The Cauchy product of the absolutely convergent series

    $e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \ldots$   and   $e^y = 1 + y + \frac{y^2}{2!} + \frac{y^3}{3!} + \ldots$

gives the series for $e^{x+y}$ (use the binomial identity of Theorem I.2.1).
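
Both examples can be verified term by term with exact rational arithmetic. The following Python sketch is ours (the helper names `cauchy_product` and `exp_terms` and the sample values of $x$, $y$, $q$ are assumptions made for illustration); it works with truncated lists of terms, which is enough because the $n$th term of a Cauchy product only involves the terms with index at most $n$.

```python
from fractions import Fraction
from math import factorial

def cauchy_product(a, b):
    """Return c with c[n] = sum_{j=0}^{n} a[n-j] * b[j] for two equally long lists of terms."""
    return [sum(a[n - j] * b[j] for j in range(n + 1)) for n in range(len(a))]

def exp_terms(x, count):
    """The terms x^k / k! (k = 0, ..., count - 1) of the exponential series, as exact fractions."""
    return [Fraction(x) ** k / factorial(k) for k in range(count)]

# Exponential series: the n-th term of the product is (x + y)^n / n!.
x, y, count = Fraction(1, 2), Fraction(-3), 12
product = cauchy_product(exp_terms(x, count), exp_terms(y, count))
expected = exp_terms(x + y, count)

# Geometric series: the product has the terms 1, 0, q^2, 0, q^4, ...
q = Fraction(1, 3)
geo = cauchy_product([q ** k for k in range(8)], [(-q) ** k for k in range(8)])
```

The equality `product == expected` is precisely the binomial identity $\sum_{j=0}^{n} \frac{x^{n-j}}{(n-j)!} \frac{y^j}{j!} = \frac{(x+y)^n}{n!}$.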

Remark. The statement of Theorem 2.16 remains true if only one of the two series
is absolutely convergent and the second is convergent (F. Mertens 1875, see
Exercise 2.3).

Under the assumption that the series $\sum_{i} a_i$, $\sum_{j} b_j$ and also their Cauchy
product (Definition 2.14) converge, the identity (2.19) holds (Abel 1826, see
Exercise 7.9).


Exchange of Infinite Series and Limits 


At several places in Chap. I, we were confronted with the problem of exchanging
an infinite series with a limit (for example, for the derivation of the series
for $e^x$ in Sect. I.2 and of those for $\sin x$ and $\cos x$ in Sect. I.4). We considered
series $d_n = \sum_{j=0}^{\infty} s_{nj}$ depending on an integer parameter $n$, and used the fact
that $\lim_{n\to\infty} d_n = \sum_{j=0}^{\infty} \lim_{n\to\infty} s_{nj}$. Already in Sect. I.2 (after Eq. (I.2.17)), it
was observed that this is not always true and that some caution is necessary. The
following theorem states sufficient conditions for the validity of such an exchange.


(2.17) Theorem. Suppose that the elements of the sequence $\{s_{0j}, s_{1j}, s_{2j}, \ldots\}$ all
have the same sign and that $|s_{n+1,j}| \ge |s_{nj}|$ for all $n$ and $j$. If there exists a
bound $B$ such that $\sum_{j=0}^{n} |s_{nj}| \le B$ for all $n \ge 0$, then

(2.20)    $\lim_{n\to\infty} \sum_{j=0}^{\infty} s_{nj} = \sum_{j=0}^{\infty} \lim_{n\to\infty} s_{nj}.$


Proof. The idea is to reformulate the hypotheses in such a way that Theorem 2.13
is directly applicable. At the beginning of this section, we saw that every series
can be converted to an infinite sequence by considering the partial sums (2.2).
Conversely, if the partial sums $s_0, s_1, s_2, \ldots$ are given, we can uniquely define
elements $a_i$ such that $\sum_{i=0}^{n} a_i = s_n$. We just have to set $a_0 = s_0$ and
$a_i = s_i - s_{i-1}$ for $i \ge 1$.

Applying this idea to the sequence $\{s_{0j}, s_{1j}, s_{2j}, \ldots\}$, we define

    $a_{0j} = s_{0j}, \quad a_{ij} = s_{ij} - s_{i-1,j}$,  so that  $\sum_{i=0}^{n} a_{ij} = s_{nj}.$

Replacing $s_{nj}$ by this expression, (2.20) becomes

(2.21)    $\lim_{n\to\infty} \sum_{j=0}^{\infty} \sum_{i=0}^{n} a_{ij} = \sum_{j=0}^{\infty} \sum_{i=0}^{\infty} a_{ij}.$

Exchanging the summations in the expression on the left side of (2.21) (this is
permitted by Theorem 1.5), we see that (2.21) is equivalent to (2.15). Therefore,
we only have to verify condition (2.16). The assumptions on $\{s_{0j}, s_{1j}, \ldots\}$ imply
that the elements $a_{0j}, a_{1j}, \ldots$ all have the same sign. Hence, we have

    $\sum_{i=0}^{n} |a_{ij}| = |s_{nj}|$   and   $\sum_{i=0}^{n} \sum_{j=0}^{n} |a_{ij}| = \sum_{j=0}^{n} |s_{nj}| \le B.$

By Theorem 2.13, this implies (2.21) and thus also (2.20).


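(2.18) Example. We will give here a rigorous proof of Theorem I.2.3. From the
binomial theorem, we have

(2.22)    $\Bigl(1 + \frac{y}{n}\Bigr)^n = 1 + y + \Bigl(1 - \frac{1}{n}\Bigr) \frac{y^2}{1 \cdot 2} + \Bigl(1 - \frac{1}{n}\Bigr)\Bigl(1 - \frac{2}{n}\Bigr) \frac{y^3}{1 \cdot 2 \cdot 3} + \ldots\,,$

which is a series depending on the parameter $n$. We set

    $s_{n0} = 1, \quad s_{n1} = y, \quad s_{n2} = \Bigl(1 - \frac{1}{n}\Bigr) \frac{y^2}{1 \cdot 2}, \quad s_{n3} = \Bigl(1 - \frac{1}{n}\Bigr)\Bigl(1 - \frac{2}{n}\Bigr) \frac{y^3}{1 \cdot 2 \cdot 3}, \quad \ldots$

For a fixed $y$ the elements of the sequence $\{s_{0j}, s_{1j}, \ldots\}$ all have the same sign,
and $\{|s_{0j}|, |s_{1j}|, \ldots\}$ is monotonically increasing. Furthermore, we have

    $\sum_{j=0}^{n} |s_{nj}| \le \sum_{j=0}^{\infty} \frac{|y|^j}{j!} \le B$

because, by the ratio test, $\sum_{j=0}^{\infty} |y|^j / j!$ is a convergent series. Hence, Theorem 2.17
yields

    $\lim_{n\to\infty} \Bigl(1 + \frac{y}{n}\Bigr)^n = 1 + y + \frac{y^2}{2!} + \frac{y^3}{3!} + \ldots\,.$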

Exercises 


2.1 Compute the Cauchy product of the two series

    $f(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \ldots$   and   $g(y) = 1 - \frac{y^2}{2!} + \frac{y^4}{4!} - \ldots$

and find the series for $f(x)g(y) + g(x)f(y)$. Justify the computations. Does
the result seem familiar?


2.2 Show that the Cauchy product of the two divergent series

    $(2 + 2 + 2^2 + 2^3 + 2^4 + \ldots)(-1 + 1 + 1 + 1 + 1 + 1 + \ldots)$

converges absolutely.


2.3 (Mertens 1875). Suppose that the series $\sum_{i=0}^{\infty} a_i$ is convergent and
that $\sum_{j=0}^{\infty} b_j$ is absolutely convergent. Prove that the Cauchy product of
Definition 2.14 is convergent and that (2.19) holds.

Hint. Put $c_n = \sum_{j=0}^{n} a_{n-j} b_j$ and apply the triangle inequality (but
only to the first sums) in the identity

    $\sum_{i=0}^{2n} c_i - \Bigl( \sum_{i=0}^{n} a_i \Bigr) \Bigl( \sum_{j=0}^{n} b_j \Bigr) = \sum_{j=0}^{n-1} b_j \sum_{i=n+1}^{2n-j} a_i + \sum_{j=n+1}^{2n} b_j \sum_{i=0}^{2n-j} a_i.$


2.4 Determine the constants $a_1, a_2, a_3, a_4, \ldots$ so that the Cauchy product of the
two series

    $(1 - a_1 + a_2 - a_3 + \ldots)(1 - a_1 + a_2 - a_3 + \ldots) = (1 - 1 + 1 - 1 + \ldots)$

becomes the divergent series $1 - 1 + 1 - \ldots$. Show that the series $1 - a_1 +
a_2 - a_3 + \ldots$ converges (Fig. 2.6). Can it converge absolutely?

Hint. The use of the generating function for the numbers $1, -a_1, a_2, -a_3, \ldots$
reduces this exercise to a known formula of Chap. I and to Wallis’s product.


FIGURE 2.6. Divergence of the Cauchy product of Exercise 2.4


2.5 Justify Eq. (I.5.26) by taking the logarithm and applying the ideas of Example 2.18.


III.3 Real Functions and Continuity


We call here Function of a variable magnitude, a quantity that is composed 
in any possible manner of this variable magnitude & of constants. 
(Joh. Bernoulli 1718, Opera, vol. 2, p. 241) 


Consequently, if $f(\frac{x}{a} + c)$ denotes an arbitrary function ...
(Euler 1734, Opera, vol. XXII, p. 59)
If now to any x there corresponds a unique, finite y, ... then y is called a 
function of x for this interval.... This definition does not require a com- 
mon rule for the different parts of the curve; one can imagine the curve as 
being composed of the most heterogeneous components or as being drawn 
without following any law. (Dirichlet 1837) 
Real functions y = f(x) of a real variable x were, since Descartes, the universal 
tool for the study of geometric curves and, since Galilei and Newton, for mechan- 
ical and astronomical calculations. The word “functio” was proposed by Leibniz 
and Joh. Bernoulli, the symbol y = f(x) was introduced by Euler (1734) (see quo- 
tations). In the Leibniz-Bernoulli-Euler era, real functions were mainly thought of 
as being composed of elementary functions (“expressio analytica quomodocunque 
.... Sic $a + 3z$, $az - 4z^2$, $az + b\sqrt{a^2 - z^2}$, $c^z$ etc. sunt functiones ipsius $z$”, Euler
1748), perhaps with different formulas for different domains (“curvas discontin- 
uas seu mixtas et irregulares appellamus’’). The 19th century, mainly under the in- 
fluence of Fourier’s heat equation and Dirichlet’s study of Fourier series, brought 
a wider notion: “any sketched curve” or “‘any values y defined in dependence of 
the values x” (see the quotation above). 


(3.1) Definition (Dirichlet 1837). A function $f : A \to B$ consists of two sets, the
domain $A$ and the range $B$, and of a rule that assigns to each $x \in A$ a unique
element $y \in B$. This correspondence is denoted by

    $y = f(x)$   or   $x \mapsto f(x).$


We say that y is the image of x and that x is an inverse image of y. 


Throughout this section, the range will be R (or an interval) and the domain 
will be an interval or a union of intervals of the form 


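    $(a,b) = \{x \in \mathbb{R} \mid a < x < b\}$   or   $[a,b] = \{x \in \mathbb{R} \mid a \le x \le b\}$   or
    $(a,b] = \{x \in \mathbb{R} \mid a < x \le b\}$   or   $[a,\infty) = \{x \in \mathbb{R} \mid a \le x < \infty\}$   or ....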


The interval (a, b) is called open, while [a, b] is closed. 
As in the following examples, we usually use braces for functions that are 
defined by different expressions on different parts of A. 


Examples. 1. The function $f : [0,1] \to \mathbb{R}$,

(3.1)    $f(x) = \begin{cases} x & 0 \le x \le 1/2 \\ 1 - x & 1/2 \le x \le 1, \end{cases}$

is plotted on the right. We observe that some
$y \in \mathbb{R}$ have no inverse image, and that some
have more than one.

2. Our second function can be defined either
by a single expression, as a limit, or with
braces by separating three cases:

(3.2)    $f(x) = \lim_{n\to\infty} \arctan(nx) = \begin{cases} \pi/2 & x > 0 \\ 0 & x = 0 \\ -\pi/2 & x < 0. \end{cases}$


3. The following function, which is difficult 
to plot, is due to Dirichlet (see Werke, vol. 2, 
p. 132, 1829, “On aurait un exemple d’une 
fonction ...”): 


(3.3)    $f(x) = \begin{cases} 0 & x \text{ irrational} \\ 1 & x \text{ rational.} \end{cases}$


4. This function is of a similar nature to Dirichlet’s,
but the peaks become lower for increasing
denominators of $x$:

(3.4)    $f(x) = \begin{cases} 0 & x \text{ irrational} \\ 1/q & x = p/q \text{ simplified fraction.} \end{cases}$


5. When $x$ tends to zero, $1/x$ tends to $\infty$,
therefore

(3.5)    $f(x) = \begin{cases} \sin(1/x) & x \ne 0 \\ 0 & x = 0 \end{cases}$

will produce an infinity of oscillations in the
neighborhood of the origin (Cauchy 1821).


6. Here the oscillations close to the origin are
less violent, due to the factor $x$, but there are
still infinitely many (Weierstrass 1874):

(3.6)    $f(x) = \begin{cases} x \cdot \sin(1/x) & x \ne 0 \\ 0 & x = 0. \end{cases}$


7. Our last example was proposed, according
to Weierstrass (1872), by Riemann (see
Sect. III.9 below) and is defined via an infinite
convergent sum:

(3.7)    $f(x) = \sum_{n=1}^{\infty} \frac{\sin(n^2 x)}{n^2}.$


Continuous Functions

... $f(x)$ will be called a continuous function, if ... the numerical values
of the difference

    $f(x + \alpha) - f(x)$

decrease indefinitely with those of $\alpha$ ...
(Cauchy 1821, Cours d’Analyse, p. 43)


Here we call a quantity $y$ a continuous function of $x$, if after choosing a
quantity $\varepsilon$ the existence of $\delta$ can be proved, such that for any value between
$x_0 - \delta \ldots x_0 + \delta$ the corresponding value of $y$ lies between $y_0 - \varepsilon \ldots y_0 + \varepsilon$.
(Weierstrass 1874)

Cauchy (1821) introduced the concept of continuous functions by requiring that
indefinitely small changes of $x$ should produce indefinitely small changes of $y$ (see
quotation). Bolzano (1817) and Weierstrass (1874) were more precise (second
quotation): the difference $f(x) - f(x_0)$ must be arbitrarily small, if the difference
$x - x_0$ is sufficiently small.


(3.2) Definition. Let $A$ be a subset of $\mathbb{R}$ and $x_0 \in A$. The function $f : A \to \mathbb{R}$ is
continuous at $x_0$ if for every $\varepsilon > 0$ there exists a $\delta > 0$ such that for all $x \in A$
satisfying $|x - x_0| < \delta$ we have $|f(x) - f(x_0)| < \varepsilon$, or in symbols:

    $\forall \varepsilon > 0 \ \ \exists \delta > 0 \ \ \forall x \in A : |x - x_0| < \delta \quad |f(x) - f(x_0)| < \varepsilon.$

The function $f(x)$ is called continuous, if it is continuous at all $x_0 \in A$.


See Fig. 3.1a for a continuous function and Figs. 3.1b to 3.1f for functions with
discontinuities.


FIGURE 3.1. Continuous and discontinuous functions


Discussion of Examples (3.1) to (3.7). The function (3.1) is continuous everywhere,
even at $x = 1/2$; the function (3.2) is discontinuous at $0$; (3.3) is discontinuous
everywhere; (3.4) is continuous for irrational $x_0$ and discontinuous for
rational $x_0$ (Exercise 3.1); (3.5) is discontinuous at $x = 0$; (3.6) is continuous
everywhere, even at $x = 0$; (3.7), which appears to exhibit violent variations, is
nevertheless everywhere continuous (as we shall see later in Theorem 4.2).


(3.3) Theorem. A function $f : A \to \mathbb{R}$ is continuous at $x_0 \in A$ if and only if for
every sequence $\{x_n\}_{n \ge 1}$ with $x_n \in A$ we have

(3.8)    $\lim_{n\to\infty} f(x_n) = f(x_0)$   if   $\lim_{n\to\infty} x_n = x_0.$


Proof. For a given $\varepsilon > 0$, choose $\delta > 0$ as in Definition 3.2. Since $x_n \to x_0$, there
exists $N$ such that $|x_n - x_0| < \delta$ for $n \ge N$. By continuity at $x_0$, we then have
$|f(x_n) - f(x_0)| < \varepsilon$ for $n \ge N$ and (3.8) holds.

Suppose now that (3.8) holds, but that $f(x)$ is discontinuous at $x_0$. The negation
of continuity at $x_0$ is

    $\exists \varepsilon > 0 \ \ \forall \delta > 0 \ \ \exists x \in A : |x - x_0| < \delta \quad |f(x) - f(x_0)| \ge \varepsilon.$

The idea is to take $\delta = 1/n$ and to attach an index $n$ to $x$ (which depends on $\delta$).
This gives us a sequence $\{x_n\}$ with elements in $A$ such that $|x_n - x_0| < 1/n$
(hence $x_n \to x_0$) and at the same time $|f(x_n) - f(x_0)| \ge \varepsilon$. This contradicts
(3.8).
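
The sequence criterion of Theorem 3.3 is a convenient way to detect a discontinuity. As a small numerical illustration (a sketch only, in floating-point arithmetic), we take function (3.5) to be $\sin(1/x)$ with the value $0$ at the origin and function (3.6) to be $x \sin(1/x)$ with the value $0$ at the origin; these readings, the names `f35`, `f36`, and the particular sequence are our assumptions.

```python
import math

def f35(x):
    """Function (3.5), read as sin(1/x) for x != 0 and 0 at x = 0."""
    return math.sin(1.0 / x) if x != 0 else 0.0

def f36(x):
    """Function (3.6), read as x * sin(1/x) for x != 0 and 0 at x = 0."""
    return x * math.sin(1.0 / x) if x != 0 else 0.0

# A sequence x_n -> 0 chosen so that sin(1/x_n) = 1 for every n.
xs = [1.0 / (math.pi / 2 + 2 * math.pi * n) for n in range(1, 200)]
values35 = [f35(x) for x in xs]   # stays near 1, although f35(0) = 0
values36 = [f36(x) for x in xs]   # tends to 0 = f36(0)
```

Along this sequence the values of (3.5) stay at $1$ while $f(0) = 0$, so (3.8) fails and the function is discontinuous at the origin. For (3.6) the values equal $x_n$ and tend to $f(0) = 0$; a single sequence does not prove continuity, of course, but here $|x \sin(1/x)| \le |x|$ gives it directly.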


(3.4) Theorem. Let $f : A \to \mathbb{R}$ and $g : A \to \mathbb{R}$ be continuous at $x_0 \in A$ and let
$\lambda$ be a real number. Then, the functions

    $f + g, \quad \lambda \cdot f, \quad f \cdot g, \quad f/g$   (if $g(x_0) \ne 0$)

are also continuous at $x_0$.


Proof. We take a sequence $\{x_n\}$ with elements in $A$ and converging to $x_0$. The
continuity of $f$ and $g$ implies that $f(x_n) \to f(x_0)$ and $g(x_n) \to g(x_0)$ for $n \to \infty$.
Theorem 1.5 then shows that

    $f(x_n) + g(x_n) \to f(x_0) + g(x_0),$

so that $f + g$ is seen to be continuous at $x_0$ (Theorem 3.3).
The continuity of the other functions can be deduced in the same way. 


Example. It is obvious that the constant function $f(x) = a$ is continuous. The
function $f(x) = x$ is continuous too (choose $\delta = \varepsilon$ in Definition 3.2). As a
consequence of Theorem 3.4, all polynomials $P(x) = a_0 + a_1 x + \ldots + a_n x^n$
are continuous, and rational functions $R(x) = P(x)/Q(x)$ are continuous at all
points $x$, where $Q(x) \ne 0$.


The Intermediate Value Theorem


This theorem has been known for a long time... 
(Lagrange 1807, Oeuvres vol. 8, p. 19, see also p. 133) 
This theorem appears geometrically evident and was used by Euler and Gauss 
without scruples (see quotation). Only Bolzano found that a “rein analytischer 
Beweis” was necessary to establish more rigor in Analysis and Algebra. 


(3.5) Theorem (Bolzano 1817). Let $f : [a,b] \to \mathbb{R}$ be a continuous function. If
$f(a) < c$ and $f(b) > c$, then there exists $\xi \in (a,b)$ such that $f(\xi) = c$.


Proof. We shall prove the statement for $c = 0$. The general result then follows
from this special case by considering $f(x) - c$ instead of $f(x)$.

The set $X = \{x \in [a,b] \ ;\ f(x) < 0\}$ is nonempty ($a \in X$) and it is
majorized by $b$. Hence, the supremum $\xi = \sup X$ exists by Theorem 1.12. We
shall show that $f(\xi) = 0$ (Fig. 3.2).

Assume that $f(\xi) = K > 0$. We put $\varepsilon = K/2 > 0$ and deduce from the
continuity of $f(x)$ at $\xi$ the existence of some $\delta > 0$ such that

    $|f(x) - K| < K/2$   for   $|x - \xi| < \delta.$

This implies that $f(x) > K/2 > 0$ for $\xi - \delta < x \le \xi$, which contradicts the fact
that $\xi$ is the supremum of $X$.
We exclude the case $f(\xi) = K < 0$ in a similar way.
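
The proof above is an existence proof based on the supremum. For actually locating a point $\xi$ numerically, a different technique is common, namely bisection; we sketch it here in Python as an illustration of the theorem, not as a rendering of the proof. The function name `bisect_root`, the tolerance, and the sample function are our assumptions.

```python
def bisect_root(f, a, b, c=0.0, tol=1e-12):
    """Approximate a point xi in (a, b) with f(xi) = c, assuming f is continuous
    on [a, b] and f(a) < c < f(b)."""
    if not (f(a) < c < f(b)):
        raise ValueError("need f(a) < c < f(b)")
    lo, hi = a, b                 # invariant: f(lo) < c <= f(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < c:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Sample use: a solution of x^3 - x = 1 in the interval (0.5, 2).
xi = bisect_root(lambda x: x ** 3 - x, 0.5, 2.0, c=1.0)
```

Each step halves an interval on whose endpoints $f - c$ changes sign, so the endpoints converge to a common limit, and by continuity (Theorem 3.3) the function takes the value $c$ there.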


FIGURE 3.2. Proof of Bolzano’s Theorem 


The Maximum Theorem 


With his theorem, which states that a continuous function of a real variable 
actually attains its least upper and greatest lower bounds, i.e., necessarily 
possesses a maximum and a minimum, Weierstrass created a tool which 
today is indispensable to all mathematicians for more refined analytical or 
arithmetical investigations. 

(Hilbert 1897, Gesammelte Abh. , vol. 3, p. 333) 


The following theorem is called “Hauptlehrsatz” (“Principal Theorem”) in Weierstrass’
lectures of 1861 and was published by Cantor (1870).

(3.6) Theorem. If $f : [a,b] \to \mathbb{R}$ is a continuous function, then it is bounded
on $[a,b]$ and admits a maximum and a minimum, i.e., there exist $u \in [a,b]$ and
$U \in [a,b]$ such that

(3.9)    $f(u) \le f(x) \le f(U)$   for all   $x \in [a,b].$


Discussion of the Assumptions. The function $f : (0,1] \to \mathbb{R}$ defined by $f(x) = 1/x$
is not bounded on $A = (0,1]$. Therefore, the assumption that the domain $A$
be closed is important.

The function $f : [0,\infty) \to \mathbb{R}$, given by $f(x) = x^2$, shows that the boundedness
of the domain of $f(x)$ is important.

The function $f : [0,1] \to \mathbb{R}$ defined by $f(1/2) = 0$ and

    $f(x) = (x - 1/2)^{-2}$   for   $x \ne 1/2$

is discontinuous at $x = 1/2$ and unbounded. Hence, it is important to assume that
the function be continuous everywhere.

Our last example exhibits a function $f : [0,1] \to \mathbb{R}$
which is bounded, but does not admit a maximum:

    $f(x) = \begin{cases} -3x + \sin(1/x) & \text{if } x \ne 0 \\ 0 & \text{if } x = 0. \end{cases}$

The supremum of the set $\{f(x) \mid x \in [0,1]\}$ is equal
to $1$, but there is no $U \in [0,1]$ with $f(U) = 1$.


Proof of Theorem 3.6. We first prove that $f(x)$ is bounded on $[a,b]$. We suppose
the contrary:

(3.10)    $\forall n \ge 1 \ \ \exists x_n \in [a,b] \quad |f(x_n)| > n.$

The sequence $x_1, x_2, x_3, \ldots$ admits a convergent subsequence by the Bolzano-Weierstrass
Theorem (Theorem 1.17). In order to avoid writing this subsequence
with new symbols, we denote it again by $x_1, x_2, x_3, \ldots$ and we simply say: “after
extracting a subsequence, we suppose that” $\lim_{n\to\infty} x_n = \xi$. Since $f$ is continuous
at $\xi$, it follows from Theorem 3.3 that $\lim_{n\to\infty} f(x_n) = f(\xi)$. This contradicts
(3.10) and proves the boundedness of $f(x)$.

In order to prove the existence of $U \in [a,b]$ such that (3.9) holds, we consider
the set $Y = \{y \ ;\ y = f(x),\ a \le x \le b\}$. This set is nonempty and bounded (as we
have just seen). Therefore, the supremum $M = \sup Y$ exists. By Definition 1.11
of the supremum, the value $M - \varepsilon$ (for an arbitrary $\varepsilon > 0$) is no longer an upper
bound of $Y$. Taking $\varepsilon = 1/n$, we thus find a sequence of elements $x_n \in [a,b]$
satisfying

(3.11)    $M - 1/n < f(x_n) \le M.$

Applying the Bolzano-Weierstrass Theorem, after extracting a subsequence, we
suppose that $\{x_n\}$ converges and we denote the limit by $U = \lim_{n\to\infty} x_n$. Because
of the continuity of $f(x)$ at $U$, it follows from (3.11) that $f(U) = M$.
The existence of a minimum is proved similarly.


Monotone and Inverse Functions 


(3.7) Definition. Let $A$ and $B$ be subsets of $\mathbb{R}$. The function $f : A \to B$ is

• injective if $f(x_1) \ne f(x_2)$ for $x_1 \ne x_2$,
• surjective if $\forall y \in B \ \ \exists x \in A \quad f(x) = y$,
• bijective if it is injective and surjective,
• increasing if $f(x_1) < f(x_2)$ for $x_1 < x_2$,
• decreasing if $f(x_1) > f(x_2)$ for $x_1 < x_2$,
• nondecreasing if $f(x_1) \le f(x_2)$ for $x_1 < x_2$,
• nonincreasing if $f(x_1) \ge f(x_2)$ for $x_1 < x_2$,
• monotone if it is nonincreasing or nondecreasing, and
• strictly monotone if it is increasing or decreasing.


Strictly monotone functions are injective. It is interesting that for real contin- 
uous functions, defined on an interval, the converse statement is true, too. 


(3.8) Lemma. If $f : [a,b] \to \mathbb{R}$ is continuous and injective, then $f$ is strictly
monotone.


Proof. For any three points $u < v < w$ we have

(3.12)    $f(v)$ is between $f(u)$ and $f(w)$.

Indeed, suppose $f(v)$ is outside this interval and, say,
closer to $f(u)$. Then there is a $\xi$ between $v$ and $w$ with
$f(u) = f(\xi)$ (Theorem 3.5). This is in contradiction
to the injectivity of $f$. Therefore, for $a < c < d < b$
the only possibilities are

    $f(a) < f(c) < f(d) < f(b)$   or   $f(a) > f(c) > f(d) > f(b)$;

all other configurations of the inequalities contradict (3.12).


Surjectivity of a function $f : A \to B$ implies that every $y \in B$ has at
least one inverse image. Injectivity then implies uniqueness of this inverse image.
Therefore, a bijective function has an inverse function $f^{-1} : B \to A$, defined by

(3.13)    $f^{-1}(y) = x \iff f(x) = y.$


(3.9) Theorem. Let $f : [a,b] \to [c,d]$ be continuous and bijective. Then, the
inverse function $f^{-1} : [c,d] \to [a,b]$ is also continuous.

Proof. Let $\{y_n\}$ with $y_n \in [c,d]$ be a sequence satisfying $\lim_{n\to\infty} y_n = y_0$. By
Theorem 3.3, we have to show that $\lim_{n\to\infty} f^{-1}(y_n) = f^{-1}(y_0)$. We therefore
consider the sequence $\{x_n\} = \{f^{-1}(y_n)\}$. Let $\{x'_n\}$ be a convergent subsequence
(which exists by the theorem of Bolzano-Weierstrass), and denote its limit by $x_0$.
The continuity of $f(x)$ at $x_0$ implies that

    $f(x_0) = \lim_{n\to\infty} f(x'_n) = \lim_{n\to\infty} y'_n = y_0,$

and consequently $x_0 = f^{-1}(y_0)$. Therefore, each convergent subsequence of
$\{x_n\} = \{f^{-1}(y_n)\}$ converges to $f^{-1}(y_0)$. This point is the only accumulation
point of the sequence $\{f^{-1}(y_n)\}$ and we have $f^{-1}(y_n) \to f^{-1}(y_0)$ (see also
Exercise 1.8).


Example. Each of the real functions $x^2, x^3, \ldots$ is strictly monotone on $[0,\infty)$ and
has there an inverse function: $\sqrt{x}, \sqrt[3]{x}, \ldots$. By Theorem 3.9, these functions are
continuous.


Limit of a Function 


The concept of the limit of a function was probably first defined with sufficient
rigour by Weierstrass.
(Pringsheim 1899, Enzyclopädie der Math. Wiss., Band II.1, p. 13)

Assume that $f(x)$ is not continuous at $x_0$ or not even defined there; in such a
situation it is interesting to know whether there exists, at least, the limit of $f(x)$
for $x$ approaching $x_0$. Obviously, $x_0$ has to be close to the domain of $f$. We say
that $x_0$ is an accumulation point of a set $A$ if

(3.14)    $\forall \delta > 0 \ \ \exists x \in A \quad 0 < |x - x_0| < \delta.$

For a bounded interval, the accumulation points consist of the interval and of the
two endpoints.


(3.10) Definition. Consider a function $f : A \to \mathbb{R}$ and let $x_0$ be an accumulation
point of $A$. We say that the limit of $f(x)$ at $x_0$ exists and is equal to $y_0$, i.e.,

(3.15)    $\lim_{x \to x_0} f(x) = y_0$

if

(3.16)    $\forall \varepsilon > 0 \ \ \exists \delta > 0 \ \ \forall x \in A : 0 < |x - x_0| < \delta \quad |f(x) - y_0| < \varepsilon.$


This definition can be modified to cover the situations $x_0 = \pm\infty$ and/or $y_0 = \pm\infty$
(see, for example, Eq. (1.10)). The assumption that $x_0$ is an accumulation
point implies that the set of $x \in A$ satisfying $0 < |x - x_0| < \delta$ is never empty.

With Definition 3.10, the continuity of $f(x)$ at $x_0$ can be expressed as follows
(see Definition 3.2):

(3.17)    $\lim_{x \to x_0} f(x)$ exists and $\lim_{x \to x_0} f(x) = f(x_0).$

Examples. The function of Fig. 3.1b has a limit $\lim_{x \to x_0} f(x)$ that is different
from $f(x_0)$. For the function (3.4), the limit $\lim_{x \to x_0} f(x)$ exists for all $x_0$ (see
Exercise 3.1; remember that the point $x_0$ is explicitly excluded in Definition 3.10)
and $\lim_{x \to x_0} f(x) = 0$.


A still weaker property is the existence of one-sided limits. 


(3.11) Definition. We say that the left-sided (respectively right-sided) limit of $f(x)$
at $x_0$ exists if (3.16) holds under the restriction $x < x_0$ (respectively $x_0 < x$).
These limits are denoted by

(3.18)    $\lim_{x \to x_0-} f(x) = y_0$   respectively   $\lim_{x \to x_0+} f(x) = y_0.$


The functions of Figs. 3.1b, 3.1c, and 3.1d possess left- and right-sided limits
(often $\ne f(x_0)$); these limits do not exist for the functions of Figs. 3.1e and 3.1f.
The following theorem is an analog to Cauchy’s criterion in Theorem 1.8.


(3.12) Theorem (Dedekind 1872). The limit $\lim_{x \to x_0} f(x)$ exists if and only if

(3.19)    $\forall \varepsilon > 0 \ \ \exists \delta > 0 \ \ \forall x, \hat{x} \in A : 0 < |x - x_0| < \delta,\ 0 < |\hat{x} - x_0| < \delta \quad |f(x) - f(\hat{x})| < \varepsilon.$


Proof. The “only if” part follows from

    $|f(x) - f(\hat{x})| \le |f(x) - y_0| + |y_0 - f(\hat{x})| < 2\varepsilon.$

For the “if” part we choose a sequence $\{x_i\}$ with $x_i \in A$ which converges to $x_0$.
Because of (3.19) the sequence $\{y_i\}$ with $y_i = f(x_i)$ is a Cauchy sequence and,
by Theorem 1.8, converges to, say, $y_0$. For an $x$ satisfying $0 < |x - x_0| < \delta$ we
now have, again from (3.19),

    $|f(x) - y_0| \le |f(x) - f(x_i)| + |f(x_i) - y_0| < 2\varepsilon,$

for $i$ sufficiently large.


Analogous results hold for the situation where $x_0 = \pm\infty$ or for one-sided
limits.


Exercises 


3.1 Show that the function (3.4) is continuous at all irrational $x_0$ and, of course,
discontinuous at rational $x_0$.

Hint. If you have difficulties, set $x_0 = \sqrt{2} - 1$ and $\varepsilon = 1/10$ and determine
for which values of $x$ you have $f(x) \ge \varepsilon$. This gives you a $\delta$ for which the
statement in Definition 3.2 is satisfied.


3.2 (Pringsheim 1899, p. 7). Show that Dirichlet’s function (3.3) can be written
as

    $f(x) = \lim_{n\to\infty} \lim_{m\to\infty} |\cos(n!\,\pi x)|^m.$


3.3 Compute the limits

    $\lim_{x \to -1} \frac{x^2 + 3x + 2}{x^2 - 1}, \qquad \lim_{x \to 0} \frac{\sqrt{4+x} - \sqrt{4-x}}{2x}.$

Remember that $(\sqrt{a} - \sqrt{b})(\sqrt{a} + \sqrt{b}) = a - b$.


3.4 Show: if $f : [a,b] \to [c,d]$ is continuous at $x_0$, and $g : [c,d] \to [u,v]$ is
continuous at $y_0 = f(x_0)$, then the composite function $(g \circ f)(x) = g(f(x))$
is continuous at $x_0$.


3.5 Here is a list of functions $f : A \to \mathbb{R}$,


 1) $f(x) = x \cdot \sin(1/x) - 2x$                          $A = (0, 0.2]$
 2) $f(x) = x/(x^2 + 1)$                                     $A = [-4, +4]$
 3) the same                                                 $A = (-\infty, +\infty)$
 4) $f(x) = (1/\sqrt{\sin x}) - 1$                           $A = (0, \pi)$
 5) the same                                                 $A = (0, \pi]$
 6) $f(x) = \sqrt{x} \cdot \sin(x^2)$                        $A = (0, 7]$
 7) the same                                                 $A = (0, \infty)$
 8) $f(x) = \arctan((x - 0.5)/(x^2 - 0.1x - 0.7))$           $A = [-1.5, 1.5]$
 9) $f(x) = \sin(x^2)$                                       $A = [-5, 5]$
10) the same                                                 $A = (-\infty, \infty)$
1) fa) ave A=[-1,1 
12) the same A = (—00, 00) 
13) $f(x) = \cos x + 0.1 \sin(40x)$                          $A = [-1.6, 1.6]$
14) $f(x) = x - [x]$                                         $A = (0, 3]$
15) $f(x) = \sqrt{x} \cdot \sin(1/x) - 2\sqrt{x}$            $A = [0, 0.1]$
16) $f(x) = 3 - 1/\sqrt{x(1-x)}$                             $A = (0, 1)$
17) $f(x) = \sin(5/x) - x$                                   $A = (0, 0.4]$

where $[x]$ denotes the largest integer not exceeding $x$. Whenever the above
definitions for $f(x)$ do not make sense (for example when a certain denominator
is zero), set $f(x) = 0$. Decide which of these functions are graphed in
Fig. 3.3.

FIGURE 3.3. Plot of 12 functions for Exercise 3.5


3.6 Which of the functions of Exercise 3.5 are continuous on A? What are the 
points of discontinuity? 


3.7. Which of the functions of Exercise 3.5 possess a maximum value on A; which 
possess a minimum value on A? 


III.4 Uniform Convergence and Uniform Continuity


The following theorem can be found in the work of Mr. Cauchy: “If the
various terms of the series $u_0 + u_1 + u_2 + \ldots$ are continuous functions, ...
then the sum $s$ of the series is also a continuous function of $x$.” But it seems
to me that this theorem admits exceptions. For example the series

    $\sin x - \tfrac{1}{2} \sin 2x + \tfrac{1}{3} \sin 3x - \ldots$

is discontinuous at each value $(2m+1)\pi$ of $x$, ...
(Abel 1826, Oeuvres, vol. 1, p. 224-225)

The Cauchy-Bolzano era (first half of 19th century) left analysis with two im- 
portant gaps: first the concept of uniform convergence, which clarifies the limit of 
continuous functions and the integral of limits; second the concept of uniform con- 
tinuity, which ensures the integrability of continuous functions. Both gaps were 
filled by Weierstrass and his school (second half of 19th century). 


The Limit of a Sequence of Functions 


We consider a sequence of functions $f_1, f_2, f_3, \ldots : A \to \mathbb{R}$. For a chosen $x \in A$
the values $f_1(x), f_2(x), f_3(x), \ldots$ are a sequence of numbers. If the limit

(4.1)    $\lim_{n\to\infty} f_n(x) = f(x)$

exists for all $x \in A$, we say that $\{f_n(x)\}$ converges pointwise on $A$ to $f(x)$.

Cauchy announced in his Cours (1821, p. 131; Oeuvres II.3, p. 120) that if
(4.1) converges for all $x$ in $A$ and if all $f_n(x)$ are continuous, then $f(x)$ is also
continuous. Here are four counterexamples to this assertion; the first one is due to
Abel (1826, see the quotation above).


Examples.
a) (Abel 1826, see the upper left picture of Fig. 4.1)

(4.2a)    $f_n(x) = \sin x - \frac{\sin 2x}{2} + \frac{\sin 3x}{3} - \ldots + (-1)^{n+1} \frac{\sin nx}{n}.$

Fig. 4.1 shows $f_1(x)$, $f_2(x)$, $f_3(x)$ and $f_{100}(x)$. Apparently, $\{f_n(x)\}$ converges to
the line $y = x/2$ for $-\pi < x < \pi$ (this can be proved using the theory of Fourier
series), but $f_n(\pi) = 0$ and for $\pi < x < 3\pi$ the limit is $y = x/2 - \pi$. Thus, the
limit function is discontinuous.
b) (upper right picture of Fig. 4.1)

(4.2b)    $f_n(x) = x^n$ on $A = [0,1]$,   $\lim_{n\to\infty} f_n(x) = \begin{cases} 0 & 0 \le x < 1 \\ 1 & x = 1. \end{cases}$


c) (lower left picture of Fig. 4.1) 


(4.2c) fn(x) = lim fr(z) = 4 0 z= 

FIGURE 4.1. Sequences of continuous functions with a discontinuous limit


d) (lower right picture of Fig. 4.1)

(4.2d)    $f_n(x) = (1 - x^2)^n$ on $A = [-1,1]$,   $\lim_{n\to\infty} f_n(x) = \begin{cases} 0 & x \ne 0 \\ 1 & x = 0. \end{cases}$


Another example, which we have already encountered, is $f_n(x) = \arctan(nx)$
(see (3.2)).


FIGURE 4.2. Sequence of uniformly convergent functions 


Explanation (Seidel 1848). We look at the upper right picture of Fig. 4.1. The
closer $x$ is chosen to the point $x = 1$, the slower is the convergence and the
larger we must take $n$ in order to obtain the prescribed precision $\varepsilon$. This allows
the discontinuity to be created. We must therefore require that, for a given $\varepsilon > 0$,
the difference $f_n(x) - f(x)$ be smaller than $\varepsilon$ for all $x \in A$, if, of course, $n \ge N$
(see Fig. 4.2).

(4.1) Definition (Weierstrass 1841). The sequence $f_n : A \to \mathbb{R}$ converges uniformly
on $A$ to $f : A \to \mathbb{R}$ if

(4.3)    $\forall \varepsilon > 0 \ \ \exists N \ge 1 \ \ \forall n \ge N \ \ \forall x \in A \quad |f_n(x) - f(x)| < \varepsilon.$


In this definition, it is important that $N$ depends only on $\varepsilon$ and not on $x \in A$.
This is why “$\forall x \in A$” stands after “$\exists N \ge 1$” in (4.3).

As in Sect. III.1 (Definition 1.7), we can replace $f(x)$ in (4.3) by all successors
of $f_n(x)$. We then get Cauchy’s criterion for uniform convergence:

(4.4)    $\forall \varepsilon > 0 \ \ \exists N \ge 1 \ \ \forall n \ge N \ \ \forall k \ge 1 \ \ \forall x \in A \quad |f_n(x) - f_{n+k}(x)| < \varepsilon.$
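
Seidel's explanation can be made quantitative for $f_n(x) = x^n$. In the Python sketch below (ours; the point $1 - 1/n$ and the restricted interval $[0, 0.9]$ are choices made for illustration), we evaluate $f_n$ at a point that moves toward $1$ as $n$ grows, and compare with the supremum of $|f_n - 0|$ over $[0, 0.9]$, which equals $0.9^n$ because $x^n$ is increasing there.

```python
def f(n, x):
    """The sequence f_n(x) = x^n."""
    return x ** n

def witness(n):
    """A point of [0, 1) at which f_n is still far from the pointwise limit 0."""
    return 1.0 - 1.0 / n

# On [0, 1): the gap |f_n(x) - 0| at x = 1 - 1/n never drops below 1/4.
gaps_full = [f(n, witness(n)) for n in (2, 10, 100, 1000, 10000)]

# On [0, 0.9]: the supremum of |f_n(x) - 0| is 0.9**n and tends to 0.
gaps_restricted = [f(n, 0.9) for n in (2, 10, 100, 1000)]
```

Since $(1 - 1/n)^n \ge 1/4$ for all $n \ge 2$, no $N$ can satisfy (4.3) with $\varepsilon = 1/4$ on $[0,1]$; on $[0, 0.9]$, however, the convergence is uniform.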


(4.2) Theorem (Weierstrass’s lectures of 1861). If $f_n : A \to \mathbb{R}$ are continuous
functions and if $f_n(x)$ converges uniformly on $A$ to $f(x)$, then $f : A \to \mathbb{R}$ is
continuous.


FIGURE 4.3. Continuity of $f(x)$


Proof. The idea is to decompose $f(x) - f(x_0)$ “in drei Theile $\varepsilon_1$ $\varepsilon_2$ $\varepsilon_3$” and then to
use an estimate for $f_N(x) - f_N(x_0)$, and the estimate (4.3) twice (see Fig. 4.3). For
a given $\varepsilon > 0$ we choose $N$ such that (4.3) is satisfied. Since the function $f_N(x)$
is continuous, there exists a $\delta > 0$ such that $|f_N(x) - f_N(x_0)| < \varepsilon$ whenever
$|x - x_0| < \delta$. With the triangle inequality, we thus get for $|x - x_0| < \delta$

    $|f(x) - f(x_0)| \le \underbrace{|f(x) - f_N(x)|}_{< \varepsilon} + \underbrace{|f_N(x) - f_N(x_0)|}_{< \varepsilon} + \underbrace{|f_N(x_0) - f(x_0)|}_{< \varepsilon} < 3\varepsilon,$

which is arbitrarily small.


Question. Is there a sequence of continuous functions $f_n(x)$ that converges to
a continuous function $f(x)$ such that the convergence $f_n(x) \to f(x)$ is not
uniform? As we have seen above, uniform convergence is a necessary hypothesis
for Theorem 4.2, but it might not be necessary for a particular example. For the
history of this problem, which occupied many mathematicians between 1850 and
1880 with numerous attempts and a wrong “proof”, see G. Cantor (1880).

First Example (similar to Cantor’s):

(4.5)   $f_n(x) = \dfrac{2nx}{1 + n^2 x^2}.$

It can easily be seen that $\lim_{n\to\infty} f_n(x) = 0$ for any fixed $x \ne 0$.
The functions $f_n(x)$ possess a maximum of height $y = 1$ at $x = 1/n$ (see the
left-hand picture of Fig. 4.4), so the convergence is not uniform. The point is,
however, that for $x = 0$ all functions $f_n(x)$ are 0. So we have convergence
here also, and the limiting function is continuous.
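The behaviour of example (4.5) can be checked numerically. The following is only a sketch: the helper names `f` and `sup_on_grid` and the grid size are our own choices, and a finite grid can only approximate a supremum. It shows that the value at a fixed point shrinks with $n$ while the largest value on $[0,1]$ stays at 1.

```python
def f(n, x):
    """Sequence (4.5): f_n(x) = 2nx / (1 + n^2 x^2)."""
    return 2.0 * n * x / (1.0 + (n * x) ** 2)


def sup_on_grid(n, points=2001):
    """Approximate sup of |f_n(x) - 0| over [0, 1].

    The grid is augmented by x = 1/n, where the maximum is attained.
    """
    grid = [i / (points - 1) for i in range(points)] + [1.0 / n]
    return max(abs(f(n, x)) for x in grid)


for n in (1, 10, 100, 1000):
    # The pointwise value at x = 0.5 tends to 0, the supremum does not.
    print(n, f(n, 0.5), sup_on_grid(n))
```

Since the supremum of $|f_n(x) - f(x)|$ does not tend to zero, no $N$ can serve all $x \in A$ at once in (4.3).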


The second example is of a similar nature and still easier to understand
(right-hand picture of Fig. 4.4):

(4.6)   $f_n(x) = \begin{cases} nx & 0 \le x \le 1/n \\ 2 - nx & 1/n \le x \le 2/n \\ 0 & 2/n \le x. \end{cases}$

For a third example see Exercise 4.1.

FIGURE 4.4. Nonuniform convergence to a continuous limit


Weierstrass’s Criterion for Uniform Convergence 


We now consider the important case where the functions are partial sums

(4.7)   $s_n(x) = \sum_{i=0}^{n} f_i(x)$

with real functions $f_i : A \to \mathbb{R}$. We call the series

(4.8)   $\sum_{i=0}^{\infty} f_i(x)$

uniformly convergent on $A$, if the sequence $\{s_n(x)\}$ of (4.7) converges
uniformly on $A$.

(4.3) Theorem (Weierstrass’s Criterion). Let

(4.9)   $|f_n(x)| \le c_n$  for all $x \in A$

and let $\sum_{n=0}^{\infty} c_n$ be a convergent series of numbers; then the
series (4.8) converges uniformly on $A$.

Proof. It is clear from (4.9) that $c_n \ge 0$. We further have

   $|s_{n+k}(x) - s_n(x)| = |f_{n+k}(x) + \ldots + f_{n+1}(x)|$
   $\qquad \le |f_{n+k}(x)| + \ldots + |f_{n+1}(x)| \le c_{n+k} + \ldots + c_{n+1} < \varepsilon.$

The last estimate holds for $n \ge N$ and all $k \ge 1$, because, by hypothesis,
the series $\sum c_n$ converges. The assertion now follows from Cauchy’s
Criterion (4.4).


Examples. a) Since $|\sin(n^2 x)| \le 1$ and $\sum 1/n^2$ is convergent, the
series (3.7) converges uniformly on $\mathbb{R}$ and represents a continuous
function. On the other hand, Abel’s example (4.2a) needs the divergence of the
series $1 + 1/2 + 1/3 + 1/4 + 1/5 + \ldots$ in order that the limit function be
discontinuous.

b) The series for the exponential function,

(4.10)   $e^x = 1 + x + \dfrac{x^2}{2!} + \dfrac{x^3}{3!} + \ldots,$

converges for all $x \in \mathbb{R}$, but does not converge uniformly on
$\mathbb{R}$ (see Fig. I.2.6b). In order to apply our theorem nevertheless, we
choose a fixed $u$ and consider $A = [-u, u]$. Since we know that
$\sum_{n=0}^{\infty} u^n/n!$ converges and since $|x^n/n!| \le u^n/n!$ for
$|x| \le u$, we conclude from Theorem 4.3 that the series (4.10) converges
uniformly on each closed interval $[-u, u]$. Since $u$ was arbitrary, we obtain
that $e^x$ is continuous for all $x \in \mathbb{R}$.
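Example b) can be illustrated numerically. The sketch below is not part of the original argument; the helper names, the grid and the truncation of the tail after `extra` terms are our own assumptions. It compares the largest error of the partial sum $s_n(x)$ on $[-u, u]$ with the tail of the majorant series $\sum u^i/i!$.

```python
import math


def partial_sum(x, n):
    """s_n(x) = sum_{i=0}^{n} x^i / i!, a partial sum of series (4.10)."""
    term, total = 1.0, 1.0
    for i in range(1, n + 1):
        term *= x / i
        total += term
    return total


def tail_bound(u, n, extra=60):
    """Majorant tail c_{n+1} + ... + c_{n+extra} with c_i = u^i / i! (truncated)."""
    term = 1.0
    for i in range(1, n + 1):
        term *= u / i
    total = 0.0
    for i in range(n + 1, n + 1 + extra):
        term *= u / i
        total += term
    return total


def sup_error(u, n, points=401):
    """Largest |e^x - s_n(x)| over an equidistant grid on [-u, u]."""
    grid = [-u + 2.0 * u * j / (points - 1) for j in range(points)]
    return max(abs(math.exp(x) - partial_sum(x, n)) for x in grid)


for n in (5, 10, 20):
    print(n, sup_error(3.0, n), tail_bound(3.0, n))
```

The error bound depends on $u$ and $n$ only, not on the point $x \in [-u, u]$, which is exactly what Theorem 4.3 asserts.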


Uniform Continuity 


It has apparently not yet been observed, that ... continuity at any single 
point ... is not the continuity ... which can be called uniform continuity, 
because it extends uniformly to all points and in all directions. 

(Heine 1870, p. 361) 


The general ideas of the proof of several theorems in §3 according to the 
principles of Mr. Weierstrass are known to me by oral communications 
from himself, from Mr. Schwarz and Mr. Cantor, so that . . . 

(Heine 1872, p. 182) 


Definition 3.2 for the continuity of a function $f : A \to \mathbb{R}$ ensures
for each $x_0 \in A$ and each $\varepsilon > 0$ the existence of a $\delta > 0$
such that the variation $|f(x) - f(x_0)|$ is bounded by $\varepsilon$ if
$|x - x_0|$ is bounded by $\delta$. The problem is that this $\delta$ is not
necessarily the same for all $x_0 \in A$.

FIGURE 4.5. Nonuniformly continuous functions (a) and (b), uniformly continuous (c)

Examples. Fig. 4.5 shows the graphs of $y = 1/x$ for $A = (0,1]$ and of $y = x^2$
for $A = [0, \infty)$. In both cases, it can be observed that the $\delta$, which
is necessary to ensure that $|f(x) - f(x_0)| < \varepsilon$ for a given
$\varepsilon$, tends to zero, in the first case for $x_0 \to 0$, in the second
case for $x_0 \to \infty$. On the contrary (Fig. 4.5c), for the function
$y = \sqrt{x}$ on $A = [0,1]$, in spite of the infinite slope of the curve at the
origin, there is a smallest $\delta_{\min} = \varepsilon^2$, which is positive.
This $\delta_{\min}$, though usually unnecessarily small, can be used throughout
the whole interval $A = [0,1]$. We call this property uniform continuity, a
notion that emerged slowly in lectures of Dirichlet in 1854 and of Weierstrass in
1861. The first publication is due to Heine (1870, p. 353).
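The three examples can be explored with a few lines of code. This is only a sketch under our own assumptions: the closed formulas in `delta_inverse` and `delta_square` are obtained here by solving $|f(x) - f(x_0)| = \varepsilon$ at the worst point $x = x_0 - \delta$, respectively $x = x_0 + \delta$, and the check for $\sqrt{x}$ is made on a finite grid only.

```python
import math


def delta_inverse(x0, eps):
    """Largest admissible delta at x0 for f(x) = 1/x on (0, 1].

    Worst case x = x0 - delta: delta / (x0 (x0 - delta)) = eps.
    """
    return eps * x0 * x0 / (1.0 + eps * x0)


def delta_square(x0, eps):
    """Largest admissible delta at x0 for f(x) = x^2 on [0, infinity).

    Worst case x = x0 + delta: 2 x0 delta + delta^2 = eps.
    """
    return math.sqrt(x0 * x0 + eps) - x0


def sqrt_delta_works(eps, points=512):
    """Check on a grid of [0, 1] that delta = eps^2 serves sqrt(x) at every x0."""
    delta = eps * eps
    grid = [i / points for i in range(points + 1)]
    return all(abs(math.sqrt(x) - math.sqrt(x0)) < eps
               for x0 in grid for x in grid if abs(x - x0) < delta)


print(delta_inverse(0.1, 0.01), delta_inverse(0.001, 0.01))   # shrinks as x0 -> 0
print(delta_square(1.0, 0.01), delta_square(1000.0, 0.01))    # shrinks as x0 -> infinity
print(sqrt_delta_works(0.125))                                # one delta for all x0
```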


(4.4) Definition. A function $f : A \to \mathbb{R}$ is uniformly continuous on
$A$ if

   $\forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall x_0 \in A \;\; \forall x \in A : \quad |x - x_0| < \delta \;\Rightarrow\; |f(x) - f(x_0)| < \varepsilon.$

Remark. The uniform continuity of a given function can often be shown using
Lagrange’s Mean Value Theorem (see Theorem III.6.11 below),

(4.11)   $f(x) - f(x_0) = f'(\xi)(x - x_0).$

If $A$ is an interval and $f$ differentiable in $A$ with

(4.12)   $M = \sup_{\xi \in A} |f'(\xi)| < \infty,$

then, for a given $\varepsilon$, we satisfy the condition of Definition 4.4 by
simply putting $\delta = \varepsilon/M$ (see also Exercise 4.3 below). However,
differentiability is by no means necessary, and we have the following astonishing
theorem.

(4.5) Theorem (Heine 1872). Let $A$ be a closed interval $[a, b]$ and let the
function $f : A \to \mathbb{R}$ be continuous on $A$; then $f$ is uniformly
continuous on $A$.


First Proof (after Heine 1872, p. 188). We assume the negation of the condition
in Definition 4.4 and choose $\delta = 1/n$ for $n = 1, 2, \ldots$. This yields

(4.13a)   $\exists \varepsilon > 0 \;\; \forall 1/n > 0 \;\; \exists x_n \in A \;\; \exists x_{0n} \in A : \quad |x_n - x_{0n}| < 1/n$
(4.13b)   such that $|f(x_n) - f(x_{0n})| \ge \varepsilon.$

After extracting a convergent subsequence from $\{x_n\}$ (which we again denote
by $\{x_n\}$; see Theorem 1.17), we have $\lim_{n\to\infty} x_n = x$, and since
$|x_n - x_{0n}| < 1/n$ we also have $\lim_{n\to\infty} x_{0n} = x$. Since $f$ is
continuous, we have (see Theorem 3.3)

   $\lim_{n\to\infty} f(x_n) = f(x) = \lim_{n\to\infty} f(x_{0n}),$

in contradiction with (4.13b).


Second Proof (Lüroth 1873). Let an $\varepsilon > 0$ be chosen. For each
$x \in [a, b]$ let $\delta(x) > 0$ be the length of the largest open interval $I$
of center $x$ such that $|f(y) - f(z)| < \varepsilon$ for $y, z \in I$. More
precisely,

(4.14)   $\delta(x) = \sup\{\delta > 0 \mid \forall y, z \in [x - \delta/2, x + \delta/2] \;\; |f(y) - f(z)| < \varepsilon\}$

(where, of course, the values $x$, $y$, and $z$ have to lie in $A$). By
continuity of $f$ at $x$, the set $\{\delta > 0 \mid \ldots\}$ in (4.14) is
nonempty, so that $\delta(x) > 0$ for all $x \in A$. If $\delta(x) = \infty$ for
some $x \in A$, the estimate $|f(y) - f(z)| < \varepsilon$ holds without any
restriction and any $\delta > 0$ will satisfy the condition in Definition 4.4.

FIGURE 4.6. Lüroth’s proof of Theorem 4.5

If $\delta(x) < \infty$ for all $x \in A$, we move $x$ to $x + \eta$. The new
interval $I'$ cannot be longer than $\delta(x) + 2|\eta|$, otherwise $I$ would be
entirely in $I'$ and could be extended. Neither can it be smaller than
$\delta(x) - 2|\eta|$. Thus, this $\delta(x)$ is a continuous function.
Weierstrass’s Maximum Theorem (Theorem 3.6), applied here in its “minimum”
version, ensures that there is a value $x_0$ such that $\delta(x_0) \le \delta(x)$
for all $x \in A$. This value $\delta(x_0)$ is positive by definition and can be
used to satisfy the condition in Definition 4.4.


Remark. If you are unsatisfied with both proofs above, you can read a third one,
published by Darboux (1875, pp. 73-74), which is based on repeated subdivision of
intervals.

Exercises

4.1 Show that the functions

      $f_n(x) = (n+1)\,x^n (1 - x), \qquad x \in A = [0, 1]$

    converge to zero for all $x \in A$, but possess a maximum at $x = n/(n+1)$
    of asymptotic height $1/e$. Therefore, we do not have uniform convergence
    despite the fact that the limiting function is continuous.

4.2 (Pringsheim 1899, p. 34). Show that the series

      $f(x) = \sum_{n=0}^{\infty} \dfrac{x^2}{(1 + x^2)^n}$

    a) converges absolutely for all $x \in \mathbb{R}$ and
    b) does not converge uniformly on $[-1, 1]$.
    c) Compute $f(x)$. Is it continuous?

4.3 The function $f : [0, 1] \to \mathbb{R}$ defined by

      $f(x) =$ ca (sn+ +2)   if $0 < x \le 1$,
      $f(x) = 0$             if $x = 0$

    is continuous on $[0, 1]$, and should therefore be uniformly continuous. Find
    explicitly for a given $\varepsilon > 0$, say $\varepsilon = 0.01$, a
    $\delta > 0$ for which we have

      $\forall x_1, x_2 \in [0, 1] : \quad |x_1 - x_2| < \delta \;\Rightarrow\; |f(x_1) - f(x_2)| < \varepsilon.$

    Hint. Use the Mean Value Theorem away from the origin and a direct estimate
    for values close to 0.

4.4 Which of the functions of Fig. 3.3 (see Exercise 3.5) are uniformly
    continuous on $A$?

III.5 The Riemann Integral


Our first question is therefore: what meaning should we give to $\int_a^b f(x)\,dx$?
(Riemann 1854, Werke, p. 239) 


By one of those insights of which only the greatest minds are capable, 
the famous geometer [Riemann] generalises the concept of the definite 
integral, ... (Darboux 1875) 


The discussion of the integral in Sects. II.5 and II.6 was based on the formula

(5.1)   $\int_a^b f(x)\,dx = F(b) - F(a)$

where $F(x)$ is a primitive of $f(x)$. We have implicitly assumed that such a
primitive always exists and is unique (up to an additive constant). Here, we will
give a precise definition of $\int_a^b f(x)\,dx$ independent of differential
calculus. This allows us to interpret $\int_a^b f(x)\,dx$ for a larger class of
functions, including discontinuous functions or functions for which a primitive
is not known. A rigorous proof of (5.1) for continuous $f$ will then be given in
Sect. III.6 below.

Cauchy (1823) described, as rigorously as was then possible, the integral of 
a continuous function as the limit of a sum. Riemann (1854), merely as a side- 
remark in his habilitation thesis on trigonometric series, defined the integral for 
more general functions. In this section, we shall describe Riemann’s theory and 
its extensions by Du Bois-Reymond and Darboux. Still more general theories, not 
treated here, are due to Lebesgue (in 1902) and Kurzweil in 1957. 


General Assumptions. Throughout this section, we shall consider functions
$f : [a,b] \to \mathbb{R}$, where $[a,b] = \{x \mid a \le x \le b\}$ is a bounded
interval and $f(x)$ is a bounded function, i.e.,

(5.2)   $\exists M \ge 0 \;\; \forall x \in [a,b] \quad |f(x)| \le M.$

Otherwise, the definition of Darboux sums (below) would not be possible.
Situations that violate one of these assumptions will be discussed in Sect. III.8.


Definitions and Criteria of Integrability 


We want to define the integral as the area between the function and the horizontal
axis. The idea is to divide the interval $[a, b]$ into small subintervals and to
approximate the area by a sum of small rectangles. A division into subintervals is
denoted by

(5.3)   $D = \{x_0, x_1, x_2, \ldots, x_n\}$

(where $a = x_0 < x_1 < \ldots < x_n = b$) and the length of a subinterval is
$\delta_i = x_i - x_{i-1}$. We then define the lower and upper Darboux sums (see
Fig. 5.1) by

(5.4)   $s(D) = \sum_{i=1}^{n} f_i\,\delta_i, \qquad S(D) = \sum_{i=1}^{n} F_i\,\delta_i,$

where

(5.5)   $f_i = \inf_{x_{i-1} \le x \le x_i} f(x), \qquad F_i = \sup_{x_{i-1} \le x \le x_i} f(x).$

Obviously, we have $s(D) \le S(D)$ and any reasonable definition of the integral
$\int_a^b f(x)\,dx$ must give a value between $s(D)$ and $S(D)$.
A division $D'$ of $[a, b]$ is called a refinement of $D$, if it contains the
points of $D$, i.e., if $D' \supset D$.


FIGURE 5.1. Darboux sums (panels: $s(D)$, $S(D)$, $S(D) - s(D)$)

FIGURE 5.2. Refinement of a division


(5.1) Lemma. If $D'$ is a refinement of $D$, then

   $s(D) \le s(D') \le S(D') \le S(D).$

Proof. Adding a single point to the division $D$ increases the lower Darboux sum
(or does not change it) and decreases the upper sum (or does not change it,
Fig. 5.2). Repeated addition of points yields the statement.

(5.2) Lemma. Let $D_1$ and $D_2$ be two arbitrary divisions, then

   $s(D_1) \le S(D_2).$

Proof. We take $D' = D_1 \cup D_2$, the division containing all points of the two
divisions (points appearing twice are counted only once). Since $D'$ is a
refinement of $D_1$ and of $D_2$, the statement follows from Lemma 5.1.


Lemma 5.2 implies that, for a given function $f : [a,b] \to \mathbb{R}$, the set
of lower Darboux sums is majorized by every upper Darboux sum (and vice versa):

(5.6)   [number line: all lower sums $s(D)$ lie to the left of all upper sums $S(D)$]

Therefore (Theorem 1.12), it makes sense to consider the supremum of the lower
sums and the infimum of the upper sums. Following Darboux (1875), we introduce
the notation

(5.7)   $\overline{\int_a^b} f(x)\,dx = \inf_D S(D)$   the upper integral,

(5.8)   $\underline{\int_a^b} f(x)\,dx = \sup_D s(D)$   the lower integral.


(5.3) Definition. A function $f : [a,b] \to \mathbb{R}$, satisfying (5.2), is
called integrable (in the sense of Riemann), if the lower and upper integrals
(5.7) and (5.8) are equal. In that case, we remove the bars in (5.7) and (5.8)
and we obtain the “Riemann integral”.

(5.4) Theorem. A function $f : [a,b] \to \mathbb{R}$ is integrable if and only if

(5.9)   $\forall \varepsilon > 0 \;\; \exists D \quad S(D) - s(D) < \varepsilon.$

Proof. By definition, the function $f(x)$ is integrable if and only if the two
sets in (5.6) are arbitrarily close. This means that, for a given
$\varepsilon > 0$, there exist two divisions $D_1$ and $D_2$ such that
$S(D_2) - s(D_1) < \varepsilon$. Taking $D = D_1 \cup D_2$ and applying Lemma 5.1
yields the statement.


(5.5) Example. Consider the function $f(x) = x$ on an interval $[a,b]$. For the
equidistant division $D_n = \{x_i = a + ih \mid i = 0, 1, \ldots, n, \; h = (b-a)/n\}$,
we obtain from (I.1.28) that

   $s(D_n) = \sum_{i=1}^{n} x_{i-1}\,h = \dfrac{b^2 - a^2}{2} - \dfrac{(b-a)^2}{2n}, \qquad S(D_n) = \sum_{i=1}^{n} x_i\,h = \dfrac{b^2 - a^2}{2} + \dfrac{(b-a)^2}{2n},$

so that $S(D_n) - s(D_n) = (b-a)^2/n$. For sufficiently large $n$ this expression
is smaller than any $\varepsilon > 0$. Therefore, the function is integrable and
the integral equals $b^2/2 - a^2/2$.
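The computation of Example 5.5 can be repeated numerically. The function name `darboux_sums_identity` is our own and the code is only a sketch; it uses the fact that $f(x) = x$ is increasing, so that $f_i = x_{i-1}$ and $F_i = x_i$.

```python
def darboux_sums_identity(a, b, n):
    """Lower and upper Darboux sums of f(x) = x for the equidistant division D_n.

    On [x_{i-1}, x_i] the infimum of f is x_{i-1} and the supremum is x_i.
    """
    h = (b - a) / n
    xs = [a + i * h for i in range(n + 1)]
    lower = sum(xs[i - 1] * h for i in range(1, n + 1))
    upper = sum(xs[i] * h for i in range(1, n + 1))
    return lower, upper


for n in (10, 100, 1000):
    lower, upper = darboux_sums_identity(1.0, 3.0, n)
    # The gap S(D_n) - s(D_n) should be close to (b - a)^2 / n.
    print(n, lower, upper, upper - lower)
```

For $a = 1$, $b = 3$ both sums enclose the value $b^2/2 - a^2/2 = 4$.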


(5.6) Example. Dirichlet’s function $f : [0,1] \to \mathbb{R}$, defined by (see (3.3))

   $f(x) = \begin{cases} 1 & x \text{ rational} \\ 0 & x \text{ irrational,} \end{cases}$

is not integrable in the sense of Riemann, because in every subinterval there are
rational and irrational numbers so that $f_i = 0$ and $F_i = 1$ for all $i$.
Consequently, $s(D) = 0$, $S(D) = 1$ for all divisions.

(5.7) Example. The function $f : [0,1] \to \mathbb{R}$ (see (3.4))

   $f(x) = \begin{cases} 0 & x \text{ irrational or } x = 0 \\ 1/q & x = p/q \text{ reduced fraction} \end{cases}$

is discontinuous at all positive rational $x$. However, for a fixed
$\varepsilon > 0$, only a finite number (say $k$) of $x$-values are such that
$f(x) > \varepsilon$. We now choose a division $D$ with
$\max_i \delta_i < \varepsilon/k$, such that the $x$-values for which
$f(x) > \varepsilon$ lie in the interior of the subintervals. Because of
$f(x) \le 1$, this implies

   $S(D) \le \varepsilon + k \cdot \max_i \delta_i < 2\varepsilon.$

Since $s(D) = 0$, we see that our function is integrable and that
$\int_0^1 f(x)\,dx = 0$.
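The finiteness claim in Example 5.7 is easy to make concrete. The sketch below (the helper name `large_values` is our own, and exact rational arithmetic is used to avoid rounding) lists the points where the function exceeds a given $\varepsilon$, so that the number $k$ of the argument can be read off.

```python
from fractions import Fraction
from math import gcd


def large_values(eps):
    """Points x = p/q in [0, 1] (reduced fractions) with f(x) = 1/q > eps."""
    eps = Fraction(eps)
    points = []
    q = 1
    while Fraction(1, q) > eps:
        # x = 0 is excluded because f(0) = 0; p runs over numerators coprime to q.
        points += [Fraction(p, q) for p in range(1, q + 1) if gcd(p, q) == 1]
        q += 1
    return points


k = len(large_values(Fraction(1, 10)))
print(k)  # number of exceptional points for eps = 1/10
```

Any division with $\max_i \delta_i < \varepsilon/k$ then makes the contribution of these $k$ points to $S(D)$ smaller than $\varepsilon$.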


The Theorem of Du Bois-Reymond and Darboux. 


I feel, however, that the manner in which the criterion of integrability was 
formulated leaves something to be desired. 
(Du Bois-Reymond 1875, p. 259) 


(5.8) Theorem (Du Bois-Reymond 1875, Darboux 1875). A function $f(x)$, satisfying
(5.2), is integrable if and only if

   $\forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall D \in \mathcal{D}_\delta \quad S(D) - s(D) < \varepsilon.$

Here, $\mathcal{D}_\delta$ denotes the set of all divisions satisfying
$\max_i \delta_i \le \delta$.

Proof. The “if” part is a simple consequence of Theorem 5.4. The difficulty of the
“only if” part resides in the fact that the division $D$, about which we know
nothing but $\max_i \delta_i \le \delta$, can be quite different from the division
$\widetilde{D}$ of Theorem 5.4.

Let $\varepsilon > 0$ be fixed and let $\widetilde{D}$ be a division satisfying
(5.9), i.e., the shaded area $\widetilde{\Delta} = S(\widetilde{D}) - s(\widetilde{D})$
in Fig. 5.3a is smaller than $\varepsilon$. The important point is that
$\widetilde{D} = \{\widetilde{x}_0, \widetilde{x}_1, \ldots, \widetilde{x}_{\widetilde{n}}\}$
consists of a finite number of points. Now take an arbitrary division
$D \in \mathcal{D}_\delta$ (see Fig. 5.3b) and set $\Delta = S(D) - s(D)$. We have
to prove that $\Delta$ becomes arbitrarily small if $\delta \to 0$.

FIGURE 5.3. Du Bois-Reymond and Darboux’s proof

Consider the union $D' = D \cup \widetilde{D}$ of the two divisions and set
$\Delta' = S(D') - s(D')$ (see Fig. 5.3c). The Darboux sums for $D'$ and $D$ are
equal everywhere, except on intervals that contain points of $\widetilde{D}$
(Fig. 5.3d). Since we have at most $\widetilde{n} - 1$ such intervals, since their
length is $\le \delta$, and since $-M \le f(x) \le M$, we have

(5.10)   $\Delta \le \Delta' + 2(\widetilde{n} - 1)\,\delta M.$

Together with $\Delta' \le \widetilde{\Delta} < \varepsilon$ (observe that $D'$ is
a refinement of $\widetilde{D}$), this estimate yields $\Delta < 2\varepsilon$
provided that $\delta < \varepsilon/(2(\widetilde{n} - 1)M)$.
Riemann Sums. Consider a division (5.3) and let $\xi_1, \xi_2, \ldots, \xi_n$ be
such that $x_0 \le \xi_1 \le x_1 \le \xi_2 \le x_2 \le \xi_3 \le \ldots$. Then, we
call

(5.11)   $\sigma = \sum_{i=1}^{n} f(\xi_i) \cdot \delta_i$

a Riemann sum. Because of (5.5), we have $f_i \le f(\xi_i) \le F_i$, so that
$s(D) \le \sigma \le S(D)$. Theorem 5.8 thus implies that

(5.12)   $\sum_{i=1}^{n} f(\xi_i) \cdot \delta_i \;\to\; \int_a^b f(x)\,dx \qquad \text{if } \max_i \delta_i \to 0,$

provided that $f : [a,b] \to \mathbb{R}$ is an integrable function.
Riemann sums are very convenient for proving properties of the integral. For
example, the limit $\max_i \delta_i \to 0$ of the trivial identity

   $\sum_{i=1}^{n} \bigl(c_1 f_1(\xi_i) + c_2 f_2(\xi_i)\bigr) \cdot \delta_i = c_1 \sum_{i=1}^{n} f_1(\xi_i) \cdot \delta_i + c_2 \sum_{i=1}^{n} f_2(\xi_i) \cdot \delta_i$

leads to (II.4.13), if the functions involved are integrable.
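A Riemann sum (5.11) is straightforward to evaluate. In the sketch below the names `riemann_sum` and `equidistant` are our own; the test function $f(x) = x^2$ on $[0,1]$, whose integral is $1/3$, is chosen only for illustration. Different choices of the points $\xi_i$ give different sums, but all of them approach the same limit as $\max_i \delta_i \to 0$.

```python
def equidistant(a, b, n):
    """Division points x_0 < x_1 < ... < x_n of [a, b] with equal spacing."""
    return [a + (b - a) * i / n for i in range(n + 1)]


def riemann_sum(f, points, tags):
    """sigma = sum f(xi_i) * delta_i, with tags[i] lying in [points[i], points[i+1]]."""
    return sum(f(t) * (points[i + 1] - points[i]) for i, t in enumerate(tags))


def square(x):
    return x * x


for n in (10, 100, 1000):
    pts = equidistant(0.0, 1.0, n)
    left = riemann_sum(square, pts, pts[:-1])            # xi_i = x_{i-1}
    right = riemann_sum(square, pts, pts[1:])            # xi_i = x_i
    mid = riemann_sum(square, pts,
                      [0.5 * (pts[i] + pts[i + 1]) for i in range(n)])
    print(n, left, mid, right)
```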


Integrable Functions 


Let us investigate which classes of functions are integrable. 


(5.9) Theorem. Let $f$ and $g$ be two integrable functions on $[a, b]$ and let
$\lambda$ be a real number. Then the functions

   $f + g, \quad \lambda \cdot f, \quad f \cdot g, \quad |f|, \quad f/g \;\; (\text{if } |g(x)| \ge C > 0)$

are again integrable.

Proof. We shall use throughout the proof the fact that $F_i - f_i$ represents the
least upper bound for the variations of $f(x)$ on $[x_{i-1}, x_i]$, i.e.,

(5.13)   $\sup_{x, y \in [x_{i-1}, x_i]} |f(x) - f(y)| = F_i - f_i.$

Indeed, suppose that $\varepsilon > 0$ is a given number. By the definition of
$F_i$ and $f_i$, there exist $\xi, \eta \in [x_{i-1}, x_i]$ such that
$f(\xi) > F_i - \varepsilon$, $f(\eta) < f_i + \varepsilon$ and therefore
$f(\xi) - f(\eta) > F_i - f_i - 2\varepsilon$. Consequently, $F_i - f_i$ is not
only an upper bound for $|f(x) - f(y)|$, but also the least upper bound.

a) Let $h(x) = f(x) + g(x)$, and denote by $F_i$, $G_i$, $H_i$, respectively,
$f_i$, $g_i$, $h_i$, the supremum, respectively, infimum of $f$, $g$, $h$ on
$[x_{i-1}, x_i]$ (see (5.5)). We then have for $x, y \in [x_{i-1}, x_i]$, using
the triangle inequality and (5.13),

(5.14)   $|h(x) - h(y)| \le |f(x) - f(y)| + |g(x) - g(y)| \le (F_i - f_i) + (G_i - g_i).$
Thus, Eq. (5.13), applied to the function $h$, shows that
$(H_i - h_i) \le (F_i - f_i) + (G_i - g_i)$, and the differences of the upper and
lower Darboux sums satisfy

(5.15)   $\sum_i (H_i - h_i)\,\delta_i \le \sum_i (F_i - f_i)\,\delta_i + \sum_i (G_i - g_i)\,\delta_i.$

For a given $\varepsilon > 0$ we choose a division $D$ (Theorem 5.4) such that
each of the two sums on the right side of (5.15) is smaller than $\varepsilon$
(in fact, we have two different divisions for $f$ and $g$, but by taking their
union we may suppose that they are the same). Consequently,
$\sum_i (H_i - h_i)\,\delta_i < 2\varepsilon$ and the function
$h(x) = f(x) + g(x)$ is integrable by Theorem 5.4.

b) The proofs of the remaining assertions are very similar. For example, for
$h(x) = \lambda \cdot f(x)$ we use

   $|h(x) - h(y)| = |\lambda| \cdot |f(x) - f(y)|$

instead of (5.14), conclude that $(H_i - h_i) \le |\lambda| \cdot (F_i - f_i)$,
and deduce integrability as above.
For the product $h(x) = f(x) \cdot g(x)$ we use

   $|h(x) - h(y)| \le |f(x)| \cdot |g(x) - g(y)| + |g(y)| \cdot |f(x) - f(y)|$
   $\qquad \le M \cdot |g(x) - g(y)| + N \cdot |f(x) - f(y)|$

(both functions $f(x)$ and $g(x)$ are bounded by assumption (5.2), say by $M$ and
$N$, respectively).

Finally, for the last assertion it suffices to prove that $1/g(x)$ is integrable
(because $f(x)/g(x) = f(x) \cdot (1/g(x))$). We set $h(x) = 1/g(x)$ and replace
(5.14) by

   $|h(x) - h(y)| = \dfrac{|g(y) - g(x)|}{|g(x)| \cdot |g(y)|} \le \dfrac{|g(x) - g(y)|}{C^2}.$

Since the constant function and $f(x) = x$ are integrable (Example 5.5), the
above theorem implies that polynomials and rational functions (away from
singularities) are integrable. The following theorem was asserted by Cauchy
(1823), but was proved rigorously only some 50 years later with the notion of
uniform continuity.


(5.10) Theorem. If $f : [a,b] \to \mathbb{R}$ is continuous, then it is integrable.

Proof. The essential point is that $f$ is uniformly continuous (Theorem 4.5). This
means that for a given $\varepsilon > 0$ there exists a $\delta > 0$ such that

   $|x - y| < \delta \;\Rightarrow\; |f(x) - f(y)| < \varepsilon.$

We take a division $D$ satisfying $\max_i \delta_i < \delta$. For
$x, y \in [x_{i-1}, x_i]$ we thus have $|f(x) - f(y)| < \varepsilon$ and, by
(5.13), $F_i - f_i \le \varepsilon$. This implies that
$S(D) - s(D) = \sum_{i=1}^{n} (F_i - f_i)\,\delta_i \le \varepsilon \sum_{i=1}^{n} \delta_i = \varepsilon (b - a)$
and the integrability of $f(x)$ follows from Theorem 5.4.
(5.11) Theorem. If $f : [a,b] \to \mathbb{R}$ is nondecreasing (or nonincreasing),
then it is integrable.

Proof. The smallest value of a nondecreasing function is at the left end point and
the largest at the right end point of the interval $[x_{i-1}, x_i]$. Hence,
$f_i = f(x_{i-1})$, $F_i = f(x_i)$ so that $f_{i+1} = F_i$ for $i = 1, \ldots, n-1$.
The idea is now to consider equidistant divisions where the length of all
subintervals is equal to $\delta$. We then have

   $\sum_i (F_i - f_i)\,\delta = F_1\delta - f_1\delta + F_2\delta - f_2\delta + F_3\delta - f_3\delta + \ldots = (f(x_n) - f(x_0)) \cdot \delta < \varepsilon,$

if $\delta$ is sufficiently small. This proves the integrability of $f(x)$.
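The telescoping sum in this proof does not require continuity. As a hedged illustration (the helper name and the step-like test function $\lfloor 3x \rfloor + x$ are our own choices), the following sketch evaluates $S(D) - s(D)$ for a nondecreasing function with jumps and compares it with $(f(b) - f(a)) \cdot \delta$.

```python
import math


def darboux_gap_monotone(f, a, b, n):
    """S(D) - s(D) for a nondecreasing f and the equidistant division with n subintervals.

    On [x_{i-1}, x_i] the infimum is f(x_{i-1}) and the supremum is f(x_i).
    """
    delta = (b - a) / n
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    return sum((f(xs[i]) - f(xs[i - 1])) * delta for i in range(1, n + 1))


def step(x):
    """A nondecreasing function on [0, 1] with jumps at x = 1/3, 2/3 and 1."""
    return math.floor(3 * x) + x


for n in (10, 50, 250):
    # Expected gap: (step(1) - step(0)) / n = 4 / n.
    print(n, darboux_gap_monotone(step, 0.0, 1.0, n), 4.0 / n)
```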


(5.12) Remark. If we change an integrable function at a finite number of points, 
the function remains integrable and the value of the integral does not change. This 
is seen by an argument similar to that of Example 5.7 above. 


(5.13) Remark. Let $a < b < c$ and assume that $f : [a,c] \to \mathbb{R}$ is a
function whose restrictions to $[a, b]$ and to $[b, c]$ are integrable. Then $f$
is integrable on $[a, c]$ and we have

(5.16)   $\int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx.$

This holds because adding the Darboux sums for the restrictions to $[a, b]$ and
$[b, c]$ yields a Darboux sum for $[a, c]$.
For $a > b$ or $a = b$ we define

(5.17)   $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx \quad\text{and}\quad \int_a^a f(x)\,dx = 0,$

so that Eq. (5.16) is true for any triple $(a, b, c)$.


Inequalities and the Mean Value Theorem 


The following inequalities are often useful for estimating integrals. We have al- 
ready used them in Sect. II.10 to obtain the estimates (II.10.15). 


(5.14) Theorem. If $f(x)$ and $g(x)$ are integrable on $[a,b]$ (with $a < b$) and
if $f(x) \le g(x)$ for all $x \in [a,b]$, then

   $\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx.$

Proof. The Riemann sums satisfy
$\sum_{i=1}^{n} f(\xi_i)\,\delta_i \le \sum_{i=1}^{n} g(\xi_i)\,\delta_i$, because
$\delta_i > 0$. For $\max_i \delta_i \to 0$ we obtain the above inequality (see
(5.12) and Theorem 1.6).
(5.15) Corollary. For integrable functions we have

   $\left| \int_a^b f(x)\,dx \right| \le \int_a^b |f(x)|\,dx.$

Proof. We apply Theorem 5.14 to $-|f(x)| \le f(x) \le |f(x)|$.

By applying Corollary 5.15 to a product of two integrable functions
$f(x) \cdot g(x)$ and using $|f(x) \cdot g(x)| \le M \cdot |g(x)|$, where
$M = \sup_{x \in [a,b]} |f(x)|$, we obtain the following useful estimate:

(5.18)   $\left| \int_a^b f(x) \cdot g(x)\,dx \right| \le \sup_{x \in [a,b]} |f(x)| \cdot \int_a^b |g(x)|\,dx.$

The next inequality is similar to (5.18), but treats the two functions $f$ and $g$
symmetrically.


(5.16) The Cauchy-Schwarz Inequality (Cauchy 1821 in $\mathbb{R}^n$, Bunyakovski
1859 for integrals, Schwarz 1885, §15, for double integrals). For integrable
functions $f(x)$ and $g(x)$ we have

(5.19)   $\left| \int_a^b f(x) g(x)\,dx \right| \le \sqrt{\int_a^b f^2(x)\,dx} \cdot \sqrt{\int_a^b g^2(x)\,dx}.$

Proof. By Theorem 5.9, we know that $f \cdot g$, $f^2$, and $g^2$ are integrable.
Using Theorem 5.14 and the linearity of the integral, we have

   $0 \le \int_a^b \bigl(f(x) - \gamma g(x)\bigr)^2 dx = \int_a^b f^2(x)\,dx - 2\gamma \int_a^b f(x) g(x)\,dx + \gamma^2 \int_a^b g^2(x)\,dx.$

We put $A = \int_a^b f^2(x)\,dx$, $B = \int_a^b f(x) g(x)\,dx$,
$C = \int_a^b g^2(x)\,dx$, and we see that $A - 2\gamma B + \gamma^2 C \ge 0$ for
all real $\gamma$. If $C = 0$ this implies that $B = 0$. For $C \ne 0$ the
discriminant of the quadratic equation cannot be positive (see (I.1.12)).
Therefore, we must have $B^2 \le AC$, which is (5.19).
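The quantities $A$, $B$, $C$ of this proof can be approximated by Riemann sums. The following sketch is our own illustration (function names, the midpoint choice of the $\xi_i$ and the sample functions $f(x) = x$, $g(x) = 1$ on $[0,1]$ are assumptions); it checks $B^2 \le AC$ and evaluates the quadratic $A - 2\gamma B + \gamma^2 C$ at its minimum $\gamma = B/C$.

```python
def midpoint_integral(f, a, b, n=2000):
    """Riemann sum on an equidistant division, with the midpoints as the xi_i."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h


def cauchy_schwarz_terms(f, g, a, b):
    """Approximations of A = int f^2, B = int f g, C = int g^2 over [a, b]."""
    A = midpoint_integral(lambda x: f(x) ** 2, a, b)
    B = midpoint_integral(lambda x: f(x) * g(x), a, b)
    C = midpoint_integral(lambda x: g(x) ** 2, a, b)
    return A, B, C


A, B, C = cauchy_schwarz_terms(lambda x: x, lambda x: 1.0, 0.0, 1.0)
gamma = B / C                       # minimiser of the quadratic in gamma
print(A, B, C)
print(B * B <= A * C, A - 2 * gamma * B + gamma ** 2 * C)
```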


(5.17) The Mean Value Theorem (Cauchy 1821). If $f : [a,b] \to \mathbb{R}$ is a
continuous function, then there exists $\xi \in [a, b]$ such that

(5.20)   $\int_a^b f(x)\,dx = f(\xi) \cdot (b - a).$

Proof. Let $m$ and $M$ be the minimum and the maximum of $f(x)$ on $[a, b]$ (see
Theorem 3.6), so that $m \le f(x) \le M$ for all $x \in [a,b]$. Applying
Theorem 5.14 and dividing by $(b - a)$ yields

   $m \le \dfrac{1}{b - a} \int_a^b f(x)\,dx \le M.$

The value $\int_a^b f(x)\,dx / (b - a)$ lies between $m = f(u)$ and $M = f(U)$.
Therefore, by Bolzano’s Theorem 3.5, we deduce the existence of $\xi \in [a,b]$
such that this value equals $f(\xi)$. This proves Eq. (5.20).
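The proof is constructive enough to be imitated numerically. The sketch below is only an illustration under our own assumptions: the mean value is approximated by a midpoint Riemann sum, the points $u$ and $U$ are located on a grid, and Bolzano’s theorem is replaced by bisection between them. The name `mean_value_point` and the test function $f(x) = x^2$ on $[0, 3]$ are hypothetical choices.

```python
def mean_value_point(f, a, b, n=2000, steps=60):
    """Return (xi, mean) with f(xi) close to mean = (1/(b-a)) * integral of f."""
    h = (b - a) / n
    mean = sum(f(a + (i + 0.5) * h) for i in range(n)) * h / (b - a)
    grid = [a + i * h for i in range(n + 1)]
    lo = min(grid, key=f)   # approximates u, where f(u) = m
    hi = max(grid, key=f)   # approximates U, where f(U) = M
    for _ in range(steps):  # bisection between u and U
        mid = 0.5 * (lo + hi)
        if f(mid) < mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi), mean


xi, mean = mean_value_point(lambda x: x * x, 0.0, 3.0)
print(xi, mean)   # xi should be near sqrt(3), the mean near 3
```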


(5.18) Theorem (Cauchy 1821). Let $f : [a,b] \to \mathbb{R}$ be continuous and let
$g : [a,b] \to \mathbb{R}$ be an integrable function that is everywhere positive
(or everywhere negative). Then, there exists $\xi \in [a,b]$ such that

(5.21)   $\int_a^b f(x) g(x)\,dx = f(\xi) \int_a^b g(x)\,dx.$

Proof. Suppose that $g(x) > 0$ for all $x$ (otherwise replace $g$ by $-g$). In
this situation, we have

   $m \cdot g(x) \le f(x) g(x) \le M \cdot g(x) \quad \text{for } x \in [a,b],$

where $m$ and $M$ are the minimum and maximum of $f(x)$. The rest of the proof is
the same as for the Mean Value Theorem.


Integration of Infinite Series 


Until very recently it was believed, that the integral of a convergent se- 
ries ... is equal to the sum of the integrals of the individual terms, and 
Mr. Weierstrass was the first to observe .. . 
(Heine 1870, Ueber trig. Reihen, J.f. Math., vol. 70, p. 353)

On several occasions we found it useful to integrate an infinite series term by
term (e.g., in the derivation of Mercator’s series (I.3.13) and in the examples of
Sect. II.6). This means that we exchanged integration with a limit of functions. We
will discuss here under what conditions this is permitted.


First Example. Let $r_1, r_2, r_3, r_4, \ldots$ be a sequence containing all
rational numbers between 0 and 1, for example

   $0, \; 1, \; \tfrac12, \; \tfrac13, \; \tfrac23, \; \tfrac14, \; \tfrac34, \; \tfrac15, \; \tfrac25, \; \tfrac35, \; \tfrac45, \; \ldots$

We then define

(5.22)   $f_n(x) = \begin{cases} 1 & \text{if } x \in \{r_1, r_2, r_3, \ldots, r_n\} \\ 0 & \text{else.} \end{cases}$

By Remark 5.12, each function $f_n : [0,1] \to \mathbb{R}$ is integrable with
integral zero. However, the limit function $f(x)$, which is Dirichlet’s function
of Example 5.6, is not integrable. (The Lebesgue integral will get rid of this
difficulty.)
Second Example. The graphs of the functions

(5.23)   $f_n(x) = \begin{cases} n^2 x & 0 \le x \le 1/n \\ 2n - n^2 x & 1/n \le x \le 2/n \\ 0 & 2/n \le x \le 2 \end{cases}$

are triangles with decreasing bases and increasing altitudes with the property
that

   $\int_0^2 f_n(x)\,dx = 1 \quad \text{for all } n.$

However, the limit function is $f(x) = 0$ for all $x \in [0, 2]$. Here, $f(x)$ is
integrable, but

   $\lim_{n\to\infty} \int_0^2 f_n(x)\,dx \ne \int_0^2 \lim_{n\to\infty} f_n(x)\,dx.$
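The second example can be verified with a short computation. This is a sketch under our own assumptions: the function names are hypothetical, and the division is chosen so that the break points $1/n$ and $2/n$ are division points, which makes the midpoint Riemann sum exact on each linear piece (up to rounding).

```python
def triangle(n, x):
    """f_n of (5.23) for 0 <= x <= 2: a triangle with base [0, 2/n] and altitude n."""
    if x <= 1.0 / n:
        return n * n * x
    if x <= 2.0 / n:
        return 2.0 * n - n * n * x
    return 0.0


def integral(n, m=50):
    """Midpoint Riemann sum of f_n over [0, 2] with cell width h = 1/(n*m)."""
    cells = 2 * n * m
    h = 2.0 / cells
    return sum(triangle(n, (i + 0.5) * h) for i in range(cells)) * h


for n in (1, 5, 50):
    # The integral stays at 1 although f_n(x) -> 0 for every fixed x.
    print(n, integral(n), triangle(n, 0.5))
```

The altitude $n$ of the triangles shows that the convergence to zero is not uniform.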


(5.19) Theorem. Consider a sequence $f_n(x)$ of integrable functions and suppose
that it converges uniformly on $[a,b]$ to a function $f(x)$. Then
$f : [a,b] \to \mathbb{R}$ is integrable and

   $\lim_{n\to\infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx.$

Proof. Uniform convergence means that, for a given $\varepsilon > 0$, there exists
an integer $N$ such that for all $n \ge N$ and for all $x \in [a,b]$ we have
$|f_n(x) - f(x)| < \varepsilon$. Consequently, we have for all $x, y \in [a, b]$
that

   $|f(x) - f(y)| \le |f_N(x) - f_N(y)| + 2\varepsilon.$

Applying (5.13), we see that

   $(F_i - f_i) \le (F_{Ni} - f_{Ni}) + 2\varepsilon,$

where, as in (5.5), we have used the notation
$F_{Ni} = \sup_{x_{i-1} \le x \le x_i} f_N(x)$ and
$f_{Ni} = \inf_{x_{i-1} \le x \le x_i} f_N(x)$. The function $f_N(x)$ is
integrable, so that for a suitable division of $[a,b]$ the difference of the upper
and the lower Darboux sums, i.e., $\sum_i (F_{Ni} - f_{Ni})\,\delta_i$, is smaller
than $\varepsilon$ (Theorem 5.4). This implies that
$\sum_i (F_i - f_i)\,\delta_i \le \varepsilon (1 + 2(b - a))$ and $f(x)$ is seen
to be integrable.

Once the integrability of the limit function $f(x)$ is established,
Corollary 5.15 implies that for $n \ge N$

   $\left| \int_a^b f_n(x)\,dx - \int_a^b f(x)\,dx \right| \le \int_a^b |f_n(x) - f(x)|\,dx \le \varepsilon (b - a).$

This implies the conclusion of the theorem.
(5.20) Corollary. Consider a sequence $f_n(x)$ of integrable functions and suppose
that the series $\sum_{n=0}^{\infty} f_n(x)$ converges uniformly on $[a, b]$.
Then, we have

   $\sum_{n=0}^{\infty} \int_a^b f_n(x)\,dx = \int_a^b \sum_{n=0}^{\infty} f_n(x)\,dx.$

FIGURE 5.4. Riemann’s example of an integrable function


Riemann’s Example. 
Since these functions have never been considered yet, it will be useful to 
start from a particular example. (Riemann 1854, Werke, p. 228) 
Riemann (1854), in order to demonstrate the power of his theory of integration,
proposed the following example of a function that is discontinuous in every
interval (see Fig. 5.4):

(5.24)   $f(x) = \sum_{n=1}^{\infty} \dfrac{B(nx)}{n^2}$   where   $B(x) = \begin{cases} x - \langle x \rangle & \text{if } x \ne k/2 \\ 0 & \text{if } x = k/2 \end{cases}$

and $\langle x \rangle$ denotes the nearest integer to $x$. This function is
discontinuous at $x = 1/2, 1/4, 3/4, 1/6, 3/6, 5/6, \ldots$; nevertheless, the
series (5.24) converges uniformly by Theorem 4.3 and the functions
$f_n(x) = B(nx)/n^2$ are integrable by Remark 5.13. Hence, $f$ is integrable.
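Riemann’s series lends itself to a small experiment. The sketch below is our own illustration: the helper names and the evaluation points are assumptions, and the bound $|B(x)| \le 1/2$ is used to estimate the tail of the series by $\tfrac12 \sum_{n > N} 1/n^2 < 1/(2N)$, independently of $x$.

```python
import math


def B(x):
    """x minus the nearest integer; 0 when x lies exactly halfway between two integers."""
    r = x - math.floor(x)          # fractional part in [0, 1)
    if r == 0.5:
        return 0.0
    return r if r < 0.5 else r - 1.0


def partial_sum(x, terms):
    """Sum of B(n x) / n^2 for n = 1, ..., terms."""
    return sum(B(n * x) / (n * n) for n in range(1, terms + 1))


# The tail after N terms is bounded by 1/(2N) for every x (uniform convergence).
print(abs(partial_sum(0.3, 2000) - partial_sum(0.3, 1000)), 1.0 / 2000)

# Values just left and right of x = 1/2 differ by a visible jump.
print(partial_sum(0.5 - 1e-9, 200), partial_sum(0.5 + 1e-9, 200))
```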


Exercises

5.1 For the function

      x otherwise

    and a given $\varepsilon > 0$, say $\varepsilon = 0.01$, construct explicitly
    a division for which $S(D) - s(D) < \varepsilon$. This will make clear that
    $f$ is integrable in the sense of Riemann.

5.2 Consider the function $f(x) = x^2$ on the interval $[0, 1]$. Compute the lower
    and upper Darboux sums for the equidistant division $x_i = i/n$,
    $i = 0, 1, \ldots, n$. Conclude from the results obtained that $f$ is
    integrable.

5.3 Show that the numerical approximations obtained from the trapezoidal rule
    (see Sect. II.6),

      $\int_a^b f(x)\,dx \approx h \left( \dfrac{f(\xi_0)}{2} + f(\xi_1) + f(\xi_2) + \ldots + f(\xi_{N-1}) + \dfrac{f(\xi_N)}{2} \right)$

    ($h = (b - a)/N$ and $\xi_i = a + ih$), as well as for Simpson’s rule ($N$ even),

      $\int_a^b f(x)\,dx \approx \dfrac{h}{3} \Bigl( f(\xi_0) + 4 f(\xi_1) + 2 f(\xi_2) + 4 f(\xi_3) + \ldots + f(\xi_N) \Bigr),$

    are Riemann sums for a certain division $D$. Therefore, convergence of these
    methods is ensured for $N \to \infty$ for all Riemann integrable functions.

5.4 (Dini 1878, Chap. 13). Show that

      $\int_0^{\pi} \ln(1 - 2a \cos x + a^2)\,dx = 0 \quad \text{for } a^2 < 1,$

      $\int_0^{\pi} \ln(1 - 2a \cos x + a^2)\,dx = \pi \ln a^2 \quad \text{for } a^2 > 1,$

    by computing Riemann sums for an equidistant division $x_i = i\pi/n$, with
    $\xi_i$ the left end point $x_{i-1}$. The Riemann sums will become the
    logarithm of a product with which we are familiar (see Sect. I.5).

5.5 Let $f : [a,b] \to \mathbb{R}$ satisfy i) $f$ is continuous, ii)
    $\forall x \in [a,b]$ we have $f(x) \ge 0$, and iii) $\exists x_0 \in (a,b)$
    with $f(x_0) > 0$. Then, show that

(5.25)   $\int_a^b f(x)\,dx > 0.$

    Show with the help of counterexamples that each of the three hypotheses i),
    ii), and iii) is necessary for proving (5.25).

5.6 Compute the integrals

      = ee One).

    Then, use $0 \le \sin x \le 1$ for $0 \le x \le \pi/2$ and Theorem 5.14 to
    establish

      $\int_0^{\pi/2} \sin^{2n} x\,dx \ge \int_0^{\pi/2} \sin^{2n+1} x\,dx \ge \int_0^{\pi/2} \sin^{2n+2} x\,dx.$

    The above values inserted into these inequalities lead to a proof of Wallis’s
    product (I.5.27) with a rigorous error estimate.

5.7 Show that

      $\dfrac12 \int_0^1 x^4 (1 - x)^4\,dx \le \int_0^1 \dfrac{x^4 (1 - x)^4}{1 + x^2}\,dx \le \int_0^1 x^4 (1 - x)^4\,dx.$

    The actual computation of these integrals leads to an amusing result (old
    souvenirs from Sect. I.6).
    Hint. To calculate $\int_0^1 x^4 (1 - x)^4\,dx$ see Exercise II.4.3.

5.8 Show that the series

      $\dfrac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + x^8 - \ldots$

    converges uniformly on $A = [-b, b]$ for each $b$ with $0 < b < 1$. Hence,
    this series can be integrated term by term on $A = [0, b]$ (or on
    $A = [-b, 0]$) and leads to the well-known series for $\arctan b$.

FIGURE 5.5. Exchange of lim and integral

5.9 For the following sequences of functions $f_n : [0,1] \to \mathbb{R}$
    (Fig. 5.5),

      NL na
      a) fn(x) = Cane b) fn(x) = C+ ie’

    compute $\lim_{n\to\infty} f_n(x)$ (distinguish the cases $x = 0$ and
    $x \ne 0$). Find the maximal point of $f_n(x)$ and decide whether convergence
    is uniform. Finally, check whether the following equality holds:

      $\lim_{n\to\infty} \int_0^1 f_n(x)\,dx = \int_0^1 \lim_{n\to\infty} f_n(x)\,dx.$
... rigor, which I wanted to be absolute in my Cours d’analyse, ...
(Cauchy 1829, Lecons) 


The total variation f(z +h) — f(x)... can in general be decomposed into 
two terms... (Weierstrass 1861) 


The derivative of a function was introduced and discussed in Sect. II.1. Now that 
we have the notion of limit at our disposal, it is possible to give a precise definition. 


(6.1) Definition (Cauchy 1821). Let $I$ be an interval and let $x_0 \in I$. The
function $f : I \to \mathbb{R}$ is differentiable at $x_0$ if the limit

(6.1)   $f'(x_0) = \lim_{x \to x_0} \dfrac{f(x) - f(x_0)}{x - x_0}$

exists. The value of this limit is the derivative of $f$ at $x_0$ and is denoted
by $f'(x_0)$. If the function $f$ is differentiable at all points of $I$ and if
$f' : I \to \mathbb{R}$ is continuous, then $f$ is called continuously
differentiable.

Sometimes it is advantageous to write $x = x_0 + h$, so that

(6.2)   $f'(x_0) = \lim_{h \to 0} \dfrac{f(x_0 + h) - f(x_0)}{h}.$

One can also, for a given $x_0$, consider the function $r : I \to \mathbb{R}$
defined by $r(x_0) = 0$ and

(6.3)   $r(x) = \dfrac{f(x) - f(x_0)}{x - x_0} - f'(x_0) \quad \text{for } x \ne x_0.$

Then, Eq. (6.1) is equivalent to $\lim_{x \to x_0} r(x) = 0$ and we have the
following criterion.


(6.2) Weierstrass’s Formulation (Weierstrass 1861, see the above quotation). A
function $f(x)$ is differentiable at $x_0$ if and only if there exists a number
$f'(x_0)$ and a function $r(x)$, continuous at $x_0$ and satisfying $r(x_0) = 0$,
such that

(6.4)   $f(x) = f(x_0) + f'(x_0)(x - x_0) + r(x)(x - x_0).$

Equation (6.4) has the advantage of containing no limit (this is replaced by the
continuity of $r(x)$) and of exhibiting the equation of the tangent line
$y = f(x_0) + f'(x_0)(x - x_0)$ to $f(x)$ at $x = x_0$. Moreover, it will be the
basis for the differentiability theory of functions of several variables.

Still simpler formulas and proofs are obtained, if the two terms in Eq. (6.4)
are collected by setting

(6.5)   $\varphi(x) = f'(x_0) + r(x).$
(6.3) Carathéodory’s Formulation (Carathéodory 1950, p. 121). A function $f(x)$
is differentiable at $x_0$ if and only if there exists a function $\varphi(x)$, continuous at $x_0$,
such that

(6.6) $f(x) = f(x_0) + \varphi(x)(x - x_0).$

The value $\varphi(x_0)$ is the derivative $f'(x_0)$ of $f$ at $x_0$.
We see immediately from (6.6) that if $f$ is differentiable at $x_0$, then it is also
continuous at $x_0$. Furthermore, since from (6.5) and (6.3) (or directly from (6.6))

(6.7) $\varphi(x) = \frac{f(x) - f(x_0)}{x - x_0}$ for $x \ne x_0$

is uniquely determined for $x \ne x_0$, the derivative $f'(x_0)$ is uniquely determined
if it exists.
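
The slope function of Eq. (6.7) can be explored numerically. The following is a minimal sketch, not part of the original text; the helper names `phi` and `square` and the choice $x_0 = 1.5$ are illustrative assumptions. For $f(x) = x^2$ the slope function equals $x + x_0$, so its values should approach $2x_0$ as $x \to x_0$.

```python
def phi(f, x0, x):
    """Slope function of Eq. (6.7): (f(x) - f(x0)) / (x - x0), defined for x != x0."""
    return (f(x) - f(x0)) / (x - x0)


def square(x):
    return x * x


x0 = 1.5
# For f(x) = x^2 the identity x^2 - x0^2 = (x + x0)(x - x0) gives phi(x) = x + x0,
# so the values below should approach 2 * x0 = 3 as the offset h shrinks.
slopes = [phi(square, x0, x0 + h) for h in (1e-1, 1e-2, 1e-3, -1e-3)]
for s in slopes:
    print(s)
```

Continuity of the slope function at $x_0$ is exactly what (6.6) demands; the printed values only illustrate it, they do not prove it.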


Remarks and Examples. 1. Obviously, the functions $f(x) = 1$ and $f(x) = x$ are
differentiable. The differentiability of $f(x) = x^2$ follows, for example, from (6.6)
with the identity $x^2 - x_0^2 = (x + x_0)(x - x_0)$ (see also Sect. II.1).

2. We emphasize that differentiability at $x_0$ is a local property. Changing the
function outside $(x_0 - \varepsilon, x_0 + \varepsilon)$ for some $\varepsilon > 0$ changes neither its differentiability
at $x_0$ nor the derivative $f'(x_0)$.

3. If $I = [a, b]$ is a closed interval and $x_0 = a$, then the limit in (6.1) should be replaced
by the right-sided limit.

4. Consider the function $f(x) = |x|$ (absolute value). At $x_0 > 0$, it is
differentiable with $f'(x_0) = 1$; at $x_0 < 0$ it is also differentiable, but with
derivative $f'(x_0) = -1$. This function is not differentiable at $x_0 = 0$, because
$f(x)/x = |x|/x$ does not have a limit for $x \to 0$.

5. The function

$f(x) = 0$ if $x$ is irrational or integer,
$f(x) = 1/q^3$ if $x = p/q$ (reduced fraction)

is discontinuous at every non-integer rational $x_0$. It
is, nevertheless, differentiable at $x_0 = 0$, since the
function $\varphi(x)$ of Eq. (6.6) becomes $\varphi(x) = f(x)/x$.
Since $|f(x)| \le |x|^2$, we have $\lim_{x\to 0} \varphi(x) = 0$ and
$f'(x_0) = 0$.

(6.4) Theorem. If $f : (a, b) \to \mathbb{R}$ is differentiable at $x_0 \in (a, b)$ and $f'(x_0) > 0$,
then there exists $\delta > 0$ such that

$f(x) > f(x_0)$ for all $x$ satisfying $x_0 < x < x_0 + \delta$,
$f(x) < f(x_0)$ for all $x$ satisfying $x_0 - \delta < x < x_0$.

If the function possesses a maximum (or minimum) at $x_0$, then $f'(x_0) = 0$.


Proof. $f'(x_0) > 0$ means that $\varphi(x_0) > 0$ (see (6.6)). By continuity, $\varphi(x) > 0$ in a
neighborhood of $x_0$. Now the stated inequalities follow from (6.7).
If the function possesses a maximum at $x_0$, then we have $f(x) \le f(x_0)$ on
both sides of $x_0$. This is only possible if $f'(x_0) = 0$.


(6.5) Remark. The statement of Theorem 6.4 does not imply that a function
satisfying $f'(x_0) > 0$ is monotonically increasing in a neighborhood of $x_0$. As a
counterexample, consider the function $f(x)$ (see Fig. 6.1), given by $f(0) = 0$ and

$f(x) = x + x^2 \sin(1/x^2)$ for $x \ne 0$.

It is differentiable everywhere and satisfies $f'(0) = 1$ (because $f(x) = x + r(x) \cdot x$
with $|r(x)| \le |x|$). For $x \ne 0$ the derivative

$f'(x) = 1 + 2x \sin\left(\frac{1}{x^2}\right) - \frac{2}{x} \cos\left(\frac{1}{x^2}\right)$

oscillates strongly near the origin. Hence, even though $f(x)$ is contained between
two parabolas, there are points with negative derivatives arbitrarily close to the
origin. By Theorem 6.4, there exist points $\xi_1 < \xi_2$, arbitrarily close to 0, for
which $f(\xi_1) > f(\xi_2)$.

We shall show later (Corollary 6.12) that, if $f'(x) > 0$ for all $x \in (a, b)$, the
function is monotonically increasing. Thus, this counterexample is only possible
because $f$ is not continuously differentiable.
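
The negative derivatives near the origin can be located explicitly. The sketch below is an illustration added here, not taken from the original; the sample points $x_k = 1/\sqrt{2\pi k}$ are an assumption chosen so that $\sin(1/x_k^2) = 0$ and $\cos(1/x_k^2) = 1$, which gives $f'(x_k) = 1 - 2\sqrt{2\pi k} < 0$.

```python
import math


def fprime(x):
    """Derivative of f(x) = x + x^2 sin(1/x^2) for x != 0 (formula of Remark 6.5)."""
    return 1 + 2 * x * math.sin(1 / x**2) - (2 / x) * math.cos(1 / x**2)


# At x_k = 1/sqrt(2*pi*k) we have 1/x_k^2 = 2*pi*k, hence sin = 0 and cos = 1,
# so f'(x_k) = 1 - 2*sqrt(2*pi*k) is negative although f'(0) = 1 > 0.
points = [1 / math.sqrt(2 * math.pi * k) for k in (1, 10, 100, 1000)]
values = [fprime(x) for x in points]
for x, v in zip(points, values):
    print(x, v)
```

The points $x_k$ tend to 0 while the derivative there becomes more and more negative, which is the oscillation described above.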


FIGURE 6.1. Graph of the function $y = x + x^2 \sin(1/x^2)$ and its derivative


(6.6) Theorem. If $f$ and $g$ are differentiable at $x_0$, then so are

$f + g$, $f \cdot g$, $f/g$ (if $g(x_0) \ne 0$).

The formulas of Sect. II.1 for their derivatives are correct.


Proof. We shall present two different proofs for the product $f \cdot g$. For $f + g$ and
$f/g$ the proofs are similar.
The first proof is based on the identity

$\frac{f(x)g(x) - f(x_0)g(x_0)}{x - x_0} = f(x) \cdot \frac{g(x) - g(x_0)}{x - x_0} + g(x_0) \cdot \frac{f(x) - f(x_0)}{x - x_0},$

which is obtained by adding and subtracting the term $f(x)g(x_0)$. Using the
continuity of $f$ at $x_0$ (Eq. (6.4)), the differentiability of $f$ and $g$, and Theorem
1.5, we see that for $x \to x_0$ the expression on the right has the finite limit
$f(x_0)g'(x_0) + g(x_0)f'(x_0)$. Hence, the product $f \cdot g$ is differentiable at $x_0$.
Our second proof is based on Carathéodory’s formulation 6.3. By hypothesis,
we have

(6.8) $f(x) = f(x_0) + \varphi(x)(x - x_0)$, $\varphi(x_0) = f'(x_0)$,
      $g(x) = g(x_0) + \psi(x)(x - x_0)$, $\psi(x_0) = g'(x_0)$.

We multiply both equations of (6.8), and obtain

$f(x)g(x) = f(x_0)g(x_0) + \Big( f(x_0)\psi(x) + g(x_0)\varphi(x) + \varphi(x)\psi(x)(x - x_0) \Big)(x - x_0).$

The function in tall brackets is evidently continuous at $x_0$ and its value for $x = x_0$
is $f(x_0)g'(x_0) + g(x_0)f'(x_0)$.


(6.7) Theorem (Chain rule for composite functions). If $y = f(x)$ is differentiable
at $x_0$ and if $z = g(y)$ is differentiable at $y_0 = f(x_0)$, then the composite function
$(g \circ f)(x) = g(f(x))$ is differentiable at $x_0$, and we have

(6.9) $(g \circ f)'(x_0) = g'(y_0) \cdot f'(x_0).$

Many of our students will appreciate the pithy elegance of this
proof. (Kuhn 1991)

Proof. We use Eq. (6.6) to write the hypothesis in the form

$f(x) - f(x_0) = \varphi(x)(x - x_0)$, $\varphi(x_0) = f'(x_0)$,
$g(y) - g(y_0) = \psi(y)(y - y_0)$, $\psi(y_0) = g'(y_0)$.

Inserting $y - y_0 = f(x) - f(x_0)$ from the first equation into the second, we obtain

$g(f(x)) - g(f(x_0)) = \psi(f(x))\,\varphi(x)(x - x_0).$

The function $\psi(f(x))\,\varphi(x)$ is again continuous at $x_0$, and its value for $x = x_0$ is

$g'(f(x_0)) \cdot f'(x_0).$


(6.8) Theorem (Inverse functions). Let $f : I \to J$ be bijective, continuous, and
differentiable at $x_0 \in I$, and suppose that $f'(x_0) \ne 0$. Then, the inverse function
$f^{-1} : J \to I$ is differentiable at $y_0 = f(x_0)$, and we have

(6.10) $(f^{-1})'(y_0) = \frac{1}{f'(x_0)}.$

Proof. In Carathéodory’s formulation (6.6), we have by hypothesis

$f(x) - f(x_0) = \varphi(x)(x - x_0)$, $\varphi(x_0) = f'(x_0)$.

We replace $x$ and $x_0$ by $f^{-1}(y)$ and $f^{-1}(y_0)$, and $f(x)$ and $f(x_0)$ by $y$ and $y_0$,
and get

$y - y_0 = \varphi(f^{-1}(y))\,\big(f^{-1}(y) - f^{-1}(y_0)\big).$

From the proof of Theorem 3.9 it follows that $f^{-1}(y)$ is continuous at $y_0$. Because
by hypothesis $\varphi(f^{-1}(y_0)) \ne 0$, we therefore have $\varphi(f^{-1}(y)) \ne 0$ in a
neighborhood of $y_0$ and we may divide this formula to obtain

$f^{-1}(y) - f^{-1}(y_0) = \frac{1}{\varphi(f^{-1}(y))}\,(y - y_0).$

This concludes the proof, since the function $1/\varphi(f^{-1}(y))$ is continuous at $y_0$.


The Fundamental Theorem of Differential Calculus 


Formula (II.4.6) is the central result of all the computations of Sect. II.4. We shall
give here a rigorous proof of this result. In particular, we shall show that every
continuous function $f(x)$ has a primitive, which is unique up to an additive constant.


(6.9) Theorem (Existence of a primitive). Let $f : [a, b] \to \mathbb{R}$ be a continuous
function. The function

(6.11) $F(x) = \int_a^x f(t)\,dt$

(which exists by Theorem 5.10) is differentiable on $(a, b)$ and satisfies $F'(x) = f(x)$.
Hence, it is a primitive of $f(x)$.


Proof. By Eq. (5.16), we have

(6.12) $F(x) - F(x_0) = \int_{x_0}^x f(t)\,dt.$

Applying the Mean Value Theorem 5.17, we get

$F(x) - F(x_0) = f(\xi)(x - x_0),$

where $\xi = \xi(x, x_0)$ lies between $x$ and $x_0$. For $x \to x_0$ the value $\xi(x, x_0)$ necessarily
tends to $x_0$, so that by continuity of $f$ at $x_0$, we have $\lim_{x\to x_0} f(\xi) = f(x_0)$.
This proves (see (6.6)) the differentiability of $F(x)$, with $F'(x_0) = f(x_0)$.
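
Theorem 6.9 can be checked numerically for a concrete integrand. The following is only a sketch under stated assumptions: the integrand $\cos t$, the midpoint rule with 2000 subintervals, and the step $h = 10^{-3}$ are choices made for illustration and do not appear in the original.

```python
import math


def integral(f, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def F(x):
    # F(x) approximates the integral of cos from 0 to x, as in Eq. (6.11).
    return integral(math.cos, 0.0, x)


x0, h = 0.7, 1e-3
# Symmetric difference quotient of F at x0; Theorem 6.9 predicts the value f(x0).
quotient = (F(x0 + h) - F(x0 - h)) / (2 * h)
print(quotient, math.cos(x0))
```

The two printed numbers agree to several digits, in line with $F'(x_0) = f(x_0)$; the remaining discrepancy comes from the quadrature and from the finite step $h$.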
Uniqueness of Primitives.


This was supplied by the mean value theorem; and it was Cauchy’s great 
service to have recognized its fundamental importance. .. . because of this, 
we adjudge Cauchy as the founder of exact infinitesimal calculus. 

(F. Klein 1908, Engl. ed. p. 213) 


See the beautiful proof of this theorem due to Mr. O. Bonnet, in the Traité 
de Calcul différentiel et intégral of Mr. Serret, vol. 1, p. 17. 
(Darboux 1875, p. 111) 
Our next aim is to prove the uniqueness (up to an additive constant) of the primi- 
tive. The following concatenation of theorems, which accomplishes this task, has 
been one of the cornerstones of the foundations of Analysis since Serret’s book 
(1868; Serret attributes these ideas to O. Bonnet; see the quotations). 


(6.10) Theorem (Rolle 1690). Let $f : [a, b] \to \mathbb{R}$ be continuous on $[a, b]$,
differentiable on $(a, b)$, and such that $f(a) = f(b)$. Then, there exists a $\xi \in (a, b)$ such
that

(6.13) $f'(\xi) = 0.$


Proof. From Theorem 3.6, we know there exist $u, U \in [a, b]$ such that $f(u) \le
f(x) \le f(U)$ for all $x \in [a, b]$. We now distinguish two situations.
If $f(u) = f(U)$, then $f(x)$ is constant and its derivative is zero everywhere.
If $f(u) < f(U)$, then at least one of the two values (say $f(U)$) is different
from $f(a) = f(b)$. We then have $a < U < b$, and by Theorem 6.4, $f'(U) = 0$.


(6.11) Theorem (Lagrange 1797). Let $f : [a, b] \to \mathbb{R}$ be continuous on $[a, b]$ and
differentiable on $(a, b)$. Then, there exists a number $\xi \in (a, b)$ such that

(6.14) $f(b) - f(a) = f'(\xi)(b - a).$


FIGURE 6.2. Proof of Rolle’s and Lagrange’s Theorems


Proof. The idea is to subtract from $f(x)$ the straight line connecting the points
$(a, f(a))$ and $(b, f(b))$, of slope $(f(b) - f(a))/(b - a)$, and to apply Rolle’s
Theorem (Fig. 6.2). We define

(6.15) $h(x) = f(x) - \Big( f(a) + (x - a)\,\frac{f(b) - f(a)}{b - a} \Big).$

Because of $h(a) = h(b) = 0$ and

$h'(x) = f'(x) - \frac{f(b) - f(a)}{b - a},$

Eq. (6.14) follows from $h'(\xi) = 0$ (Theorem 6.10).
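
For a concrete function the point $\xi$ of Eq. (6.14) can be computed. The sketch below is an added illustration with assumed data ($f(x) = x^3$ on $[0, 2]$) and a hypothetical helper `bisect`; it searches for a zero of the derivative of the auxiliary function $h$ from Eq. (6.15). Bisection works here because that derivative changes sign on the interval; this is a property of the example, not of the theorem.

```python
def bisect(g, lo, hi, tol=1e-12):
    """Find a zero of g in [lo, hi], assuming g(lo) and g(hi) have opposite signs."""
    glo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        gmid = g(mid)
        if (glo < 0) == (gmid < 0):
            lo, glo = mid, gmid   # the sign change lies in the right half
        else:
            hi = mid              # the sign change lies in the left half
    return 0.5 * (lo + hi)


a, b = 0.0, 2.0
f = lambda x: x**3
fprime = lambda x: 3 * x**2
slope = (f(b) - f(a)) / (b - a)        # slope of the chord through (a, f(a)), (b, f(b))
hprime = lambda x: fprime(x) - slope   # derivative of h from Eq. (6.15)
xi = bisect(hprime, a, b)
print(xi, fprime(xi) * (b - a), f(b) - f(a))
```

Here $3\xi^2 = 4$, so the computed value should be close to $2/\sqrt{3}$.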


(6.12) Corollary. Let $f, g : [a, b] \to \mathbb{R}$ be continuous on $[a, b]$ and differentiable
on $(a, b)$. We then have

a) if $f'(\xi) = 0$ for all $\xi \in (a, b)$, then $f(x) = C$ (constant);

b) if $f'(\xi) = g'(\xi)$ for all $\xi \in (a, b)$, then $f(x) = g(x) + C$;

c) if $f'(\xi) > 0$ for all $\xi \in (a, b)$, then $f(x)$ is monotonically increasing, i.e.,
$f(x_1) < f(x_2)$ for $a \le x_1 < x_2 \le b$; and

d) if $|f'(\xi)| \le M$ for all $\xi \in (a, b)$, then $|f(x_1) - f(x_2)| \le M|x_1 - x_2|$ for
$x_1, x_2 \in [a, b]$.


Proof. Applying Eq. (6.14) to the interval $[a, x]$ yields statement (a) with $C = f(a)$.
Statement (b) follows from (a). The remaining two statements are obtained
from Theorem 6.11 applied to the interval $[x_1, x_2]$.


(6.13) The Fundamental Theorem of Differential Calculus. Let $f(x)$ be a continuous
function on $[a, b]$. Then, there exists a primitive $F(x)$ of $f(x)$, unique up
to an additive constant, and we have

(6.16) $\int_a^b f(x)\,dx = F(b) - F(a).$


Proof. The existence of $F(x)$ is clear from Theorem 6.9. Uniqueness (up to a
constant) is a consequence of Corollary 6.12b. If $F(x)$ is an arbitrary primitive of
$f(x)$, then we have $F(x) = \int_a^x f(t)\,dt + C$. Setting $x = a$ yields $C = F(a)$, and
Eq. (6.16) is obtained on setting $x = b$.


Fig.6.3 shows the impressive genealogical tree of the theorems that are 
needed for a rigorous proof of the fundamental theorem. If Leibniz had known 
about this diagram, he might not have had the courage to state and use this theo- 
rem. 

The “Fundamental Theorem of Differential Calculus” allows us to formulate 
theorems of Differential Calculus (Sect. III.6) as theorems of Integral Calculus 
(Sect. III.5) and vice versa. This fact was exploited in Sect. II.4 on several oc- 
casions. “Integration by Substitution” (Eq. (II.4.14)) and “Integration by Parts” 
(Eq. (II.4.20)) now have a sound theoretical basis. One has only to require that the 
functions involved be continuous, so that the integrals exist. 
[Figure 6.3: tree diagram. The Fundamental Theorem of Calculus rests on Thm. 6.11 (Lagrange), Thm. 6.10 (Rolle), Thm. 6.4, Thm. 5.10 (existence of the integral), Thm. 5.17 (Mean Value), Thm. 4.5 (uniform continuity), Rem. 5.13, Thm. 5.14, Thm. 3.5 (intermediate value), Thm. 3.6 (max, min), Thm. 3.3 (continuous functions), Thm. 1.17 (Bolzano-Weierstrass), Thm. 1.12 (sup), Thm. 1.8 (Cauchy sequences), Thm. 1.6 and Thm. 1.5 (limits), and ultimately on the definition of real numbers, the definition of limit, and logic.]


FIGURE 6.3. Genealogical tree of the Fundamental Theorem


The Rules of de L’Hospital 


... entirely above the vain glory, which most scientists so avidly seek ...
(Fontenelle’s opinion concerning
Guillaume-François-Antoine de L’Hospital, Marquis de Sainte-Mesme et
du Montellier, Comte d’Antremonts, Seigneur d’Ouques, 1661-1704)


Besides, I acknowledge that I owe very much to the bright minds of
the Bernoulli brothers, especially to the young one presently Professor in
Groningen. I have made free use of their discoveries ...
(de L’Hospital 1696)


We start with the following generalization of Lagrange’s Theorem 6.11.

(6.14) Theorem (Cauchy 1821). Let $f : [a, b] \to \mathbb{R}$ and $g : [a, b] \to \mathbb{R}$ be
continuous on $[a, b]$ and differentiable on $(a, b)$. If $g'(x) \ne 0$ for $a < x < b$, then
$g(b) \ne g(a)$ and there exists $\xi \in (a, b)$ such that

(6.17) $\frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(\xi)}{g'(\xi)}.$

Proof. We first observe that $g(b) \ne g(a)$ by Rolle’s theorem, since $g'(\xi) \ne 0$ for
all $\xi \in (a, b)$. We then note that for $g(x) = x$ this result reduces to Theorem 6.11.
Inspired by the proof of this theorem, we replace (6.15) by

(6.18) $h(x) = f(x) - \Big( f(a) + \big(g(x) - g(a)\big)\,\frac{f(b) - f(a)}{g(b) - g(a)} \Big).$

The conditions of Rolle’s Theorem 6.10 are satisfied, and consequently there exists
$\xi \in (a, b)$ with $h'(\xi) = 0$. This is equivalent to (6.17).


Problem. Suppose we want to compute the limit of a quotient $f(x)/g(x)$. If both
functions, $f(x)$ and $g(x)$, tend to 0 or to $\infty$ when $x \to b$, then we are confronted
with undetermined expressions of the form

$\frac{0}{0}$ or $\frac{\infty}{\infty}$.

The following theorems and examples show how such situations can be handled.


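(6.15) Theorem (Joh. Bernoulli 1691/92, de L’Hospital 1696). Let $f : (a, b) \to \mathbb{R}$
and $g : (a, b) \to \mathbb{R}$ be differentiable on $(a, b)$ and suppose that $g'(x) \ne 0$ for
$a < x < b$. If

(6.19) $\lim_{x\to b-} f(x) = 0$ and $\lim_{x\to b-} g(x) = 0$

and if $\lim_{x\to b-} f'(x)/g'(x) = \lambda$ exists, then

(6.20) $\lim_{x\to b-} \frac{f(x)}{g(x)} = \lim_{x\to b-} \frac{f'(x)}{g'(x)}.$


Proof. The existence of the limit of $f'(x)/g'(x)$ for $x \to b-$ means that for a
given $\varepsilon > 0$ there exists a $\delta > 0$ such that

(6.21) $\Big| \frac{f'(\xi)}{g'(\xi)} - \lambda \Big| < \varepsilon$ for $b - \delta < \xi < b$.

For $u, v \in (b - \delta, b)$ it then follows from Theorem 6.14 that

(6.22) $\Big| \frac{f(u) - f(v)}{g(u) - g(v)} - \lambda \Big| = \Big| \frac{f'(\xi)}{g'(\xi)} - \lambda \Big| < \varepsilon.$

In this formula, we let $v \to b-$, use (6.19), and so obtain $|f(u)/g(u) - \lambda| \le \varepsilon$ for
$b - \delta < u < b$. This proves (6.20).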


(6.16) Remark. With slight modifications of the above proof, one sees that
- the theorem remains true for $b = +\infty$;

- the theorem remains true for $\lambda = +\infty$ or $\lambda = -\infty$; and

- the theorem remains true for the limit $x \to a+$.
(6.17) Theorem. Under the assumptions of Theorem 6.15, where (6.19) is replaced by

(6.23) $\lim_{x\to b-} f(x) = \infty$ and $\lim_{x\to b-} g(x) = \infty$,

we also have (6.20).

Proof. We multiply (6.22) by $\Big| \frac{g(v) - g(u)}{g(v)} \Big| = \Big| 1 - \frac{g(u)}{g(v)} \Big|$, which gives

(6.24) $\Big| \frac{f(v) - f(u)}{g(v)} - \lambda \Big( 1 - \frac{g(u)}{g(v)} \Big) \Big| < \varepsilon \, \Big| 1 - \frac{g(u)}{g(v)} \Big|.$

We wish to isolate $|f(v)/g(v) - \lambda|$ in the expression on the left. Using the modified
triangle inequality $|A| - |B| \le |A - B|$ (or $|A| \le |A - B| + |B|$), we obtain

$\Big| \frac{f(v)}{g(v)} - \lambda \Big| < \varepsilon \, \Big| 1 - \frac{g(u)}{g(v)} \Big| + \Big| \frac{f(u) - \lambda g(u)}{g(v)} \Big|.$

Now we keep $u$ fixed and let $v \to b-$. Because of (6.23), the expression on the
right side approaches $\varepsilon$. Therefore, $|f(v)/g(v) - \lambda| < 2\varepsilon$ for $v$ sufficiently close
to $b$. This proves the statement.


Examples. The quotient of the functions $f(x) = \sin x$ and $g(x) = x$ gives, for
$x \to 0$, the undetermined expression $0/0$. Applying Theorem 6.15, we compute

(6.25) $\lim_{x\to 0} \frac{\sin x}{x} = \lim_{x\to 0} \frac{\cos x}{1} = 1.$

Obviously, these equalities have to be read from right to left. Since $\lim_{x\to 0} \cos x = 1$
exists, $\lim_{x\to 0} \sin x / x$ also exists and equals 1.
Next, we consider $f(x) = e^{\alpha x}$ ($\alpha > 0$) and $g(x) = x^n$, which both tend to
$\infty$ for $x \to \infty$. Repeated application of Theorem 6.17 (and Remark 6.16) yields

(6.26) $\lim_{x\to\infty} \frac{e^{\alpha x}}{x^n} = \lim_{x\to\infty} \frac{\alpha e^{\alpha x}}{n x^{n-1}} = \lim_{x\to\infty} \frac{\alpha^2 e^{\alpha x}}{n(n-1) x^{n-2}} = \ldots = \lim_{x\to\infty} \frac{\alpha^n e^{\alpha x}}{n!} = \infty.$

This shows that the exponential function $e^{\alpha x}$ increases faster (for $x \to \infty$) than
any polynomial.
For $\alpha > 0$ we obtain from Theorem 6.17 and Remark 6.16

(6.27) $\lim_{x\to\infty} \frac{\ln x}{x^\alpha} = \lim_{x\to\infty} \frac{1/x}{\alpha x^{\alpha-1}} = \lim_{x\to\infty} \frac{1}{\alpha x^\alpha} = 0.$

Hence, any polynomial increases faster than a logarithm.
Undetermined expressions of the form

$0 \cdot \infty$ or $0^0$ or $\infty^0$

can be treated as explained in the following examples:

(6.28) $\lim_{x\to 0+} x \ln x = \lim_{x\to 0+} \frac{\ln x}{1/x} = \lim_{x\to 0+} \frac{1/x}{-1/x^2} = 0,$

(6.29) $\lim_{x\to 0+} x^x = \lim_{x\to 0+} \exp(x \ln x) = \exp\big( \lim_{x\to 0+} x \ln x \big) = \exp(0) = 1,$

(6.30) $\lim_{x\to\infty} \sqrt[x]{x} = \lim_{x\to\infty} x^{1/x} = \exp\Big( \lim_{x\to\infty} \frac{\ln x}{x} \Big) = \exp(0) = 1.$

In the last two examples, we have exploited the continuity of the exponential function.
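
The limits of these examples can be observed numerically. The sketch below is an illustration only; the sample points, the exponent $\alpha = 1/2$, and the helper names are assumptions, and a finite table of values proves nothing about a limit.

```python
import math


def x_pow_x(x):
    """x^x written as exp(x ln x), as in Eq. (6.29)."""
    return math.exp(x * math.log(x))


def x_root_x(x):
    """x^(1/x) written as exp(ln(x)/x), as in Eq. (6.30)."""
    return math.exp(math.log(x) / x)


small = [10.0 ** (-k) for k in (2, 4, 8)]   # sample points approaching 0 from the right
large = [10.0 ** k for k in (2, 4, 8)]      # sample points growing towards infinity
table = {
    "sin(x)/x": [math.sin(x) / x for x in small],                 # Eq. (6.25), limit 1
    "x*ln(x)": [x * math.log(x) for x in small],                  # Eq. (6.28), limit 0
    "x**x": [x_pow_x(x) for x in small],                          # Eq. (6.29), limit 1
    "x**(1/x)": [x_root_x(x) for x in large],                     # Eq. (6.30), limit 1
    "ln(x)/x**0.5": [math.log(x) / math.sqrt(x) for x in large],  # Eq. (6.27), limit 0
}
for name, row in table.items():
    print(name, row)
```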


Derivatives of Infinite Series 


Where is it proved that one obtains the derivative of an infinite series by 
taking the derivative of each term? 
(Abel, Janv. 16, 1826, Oeuvres, vol. 2, p. 258) 


The term-by-term differentiation of infinite series is justified by the following the- 
orem. 


(6.18) Theorem. Let $f_n : (a, b) \to \mathbb{R}$ be a sequence of continuously differentiable
functions. If

i) $\lim_{n\to\infty} f_n(x) = f(x)$ on $(a, b)$, and
ii) $\lim_{n\to\infty} f_n'(x) = p(x)$, where the convergence is uniform on $(a, b)$,

then $f(x)$ is continuously differentiable on $(a, b)$, and for all $x \in (a, b)$ we have

(6.31) $\lim_{n\to\infty} f_n'(x) = f'(x).$


Proof. As we can guess, the essential “ingredient” of this proof (in addition to the
Fundamental Theorem of Differential Calculus) is Theorem 5.19 on the exchange
of limits and integrals.

We fix $x_0 \in (a, b)$. Because $\{ f_n'(x) \}$ converges uniformly on $(a, b)$, we obtain

$\int_{x_0}^x p(t)\,dt = \lim_{n\to\infty} \int_{x_0}^x f_n'(t)\,dt = \lim_{n\to\infty} \big( f_n(x) - f_n(x_0) \big) = f(x) - f(x_0).$

By Theorem 6.9, this shows that $p(x) = f'(x)$ and that (6.31) holds. The continuity
of $f'(x)$ follows from Theorem 4.2.


(6.19) Counterexamples. The functions (see Fig. 6.4)

(6.32) $f_n(x) = \frac{x}{1 + n^2 x^2}$ and $f_n(x) = \frac{1}{n} \sin(nx)$

show that hypothesis (i) (even with uniform convergence) is not sufficient to prove
(6.31).
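
The second sequence, $f_n(x) = \frac{1}{n}\sin(nx)$, can be examined numerically. This is a sketch added for illustration; the grid on $[-1, 1]$ and the chosen values of $n$ are assumptions. The sup-norm of $f_n$ shrinks like $1/n$, yet $f_n'(0) = \cos(0) = 1$ for every $n$, while the limit function $f = 0$ has derivative 0.

```python
import math


def f(n, x):
    """f_n(x) = sin(n x) / n."""
    return math.sin(n * x) / n


def fprime(n, x):
    """Derivative of f_n: cos(n x)."""
    return math.cos(n * x)


ns = (1, 10, 100, 1000)
grid = [i / 1000 for i in range(-1000, 1001)]          # sample points in [-1, 1]
sup_norms = [max(abs(f(n, x)) for x in grid) for n in ns]
derivs_at_0 = [fprime(n, 0.0) for n in ns]
print(sup_norms)     # bounded by 1/n, so f_n -> 0 uniformly
print(derivs_at_0)   # stays at 1, so lim f_n'(0) differs from (lim f_n)'(0) = 0
```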
FIGURE 6.4. Uniform convergence with $\lim f_n' \ne (\lim f_n)'$


Exercises


6.1 Let a positive integer $n$ be given and define $f_n : \mathbb{R} \to \mathbb{R}$ by


x” sin(1/23 ifx £0, 
p= { (1/2") # 
0 ife=0. 
How often is $f_n$ differentiable and which derivatives of $f_n$ are continuous?


6.2 Show by two different methods (using (6.1) as well as Carathéodory’s formulation
(6.6)) that if $g(x)$ is differentiable at $x_0$ with $g(x_0) \ne 0$, then $1/g(x)$
is also differentiable at $x_0$.


6.3 Show that the following function is increasing on $[0, 1]$:

$f(x) = x \big( 2 - \cos(\ln x) - \sin(\ln x) \big)$ if $0 < x \le 1$,
$f(x) = 0$ if $x = 0$,

but that there are infinitely many points with $f'(\xi) = 0$. Is this a contradiction
to Eq. (6.14)? Is $f(x)$ differentiable at the origin?


6.4 a) Let $h : [a, b] \to \mathbb{R}$ be continuous on $[a, b]$ and $n$ times differentiable on
$(a, b)$. Show that if $h(x)$ has $n + 1$ zeros in $[a, b]$, then there exists $\xi \in (a, b)$
with $h^{(n)}(\xi) = 0$.
Hint. Apply Rolle’s Theorem repeatedly.
b) Set $h(x) = f(x) - p(x)$, where $p(x)$ is the interpolation polynomial on
equidistant gridpoints (see Eq. (II.2.6)), and conclude that for an $n$ times
differentiable function $f(x)$ (see Eq. (II.2.7)),

(6.33) $\frac{\Delta^n y_0}{\Delta x^n} = f^{(n)}(\xi).$


6.5 The function of Fig. 6.5, often called “the devil’s staircase”, shows that Lagrange’s
Theorem (see Corollary 6.12a) is not as trivial as it might appear.
If $x$ has a representation in base 3 as, e.g., $x = 0.20220002101220\ldots$, then
$f(x)$ is obtained in base 2 by converting all 2’s preceding the first 1 into 1’s
and deleting all subsequent digits, in our example $f(x) = 0.101100011$. In
particular,

$f(x) = \frac12$ if $x \in [\frac13, \frac23]$, $f(x) = \frac14$ if $x \in [\frac19, \frac29]$, $f(x) = \frac34$ if $x \in [\frac79, \frac89]$.

Show that this function is continuous and nondecreasing. It is differentiable with
derivative $f'(x) = 0$ on a set of measure $1/3 + 2/9 + 4/27 + 8/81 + \ldots = 1$,
hence, as we say, almost everywhere. Nevertheless, $f(0) \ne f(1)$.


FIGURE 6.5. The devil’s staircase

ie 
6.6 Compute by L’Hospital’s Rule (and using logarithms) — lim (1 - -) 
L200 xv 
6.7 (Approximate rectification of the arc of a circle). Let a circle of radius 1 be
given. For a point M on the circle let N be the point on the tangent at O such
that NO = arc MO. Compute the position of P on the orthogonal diameter
OC collinear with N and M (see Fig. 6.6). What is the limiting position of P
if a tends to zero?


FIGURE 6.6. Approximate rectification of the arc of a circle


Remark. The answer is 3. Therefore, if P is placed exactly at the point $x = 3$,
then NO is an excellent approximation for arc MO.


6.8 Consider the sequence


1 
fn(z) = 4/5 +2? = 127354008 
n 


Show that $f_n(x)$ converges uniformly on $[-1, 1]$ to a function $f(x)$. Is $f(x)$
differentiable? For which values of $x$ is $\lim_{n\to\infty} f_n'(x) = f'(x)$?
III.7 Power Series and Taylor Series


After a scientific meeting at which Cauchy presented his theory on the con- 

vergence of series Laplace hastened home and remained there in reclusion 

until he had examined the series in his Mécanique céleste. Luckily every 

one was found to be convergent. (M. Kline 1972, p. 972) 
Let $c_0, c_1, c_2, c_3, \ldots$ be a sequence of real coefficients and let $x$ be the independent
variable. Then, we call

(7.1) $\sum_{n=0}^{\infty} c_n x^n = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \ldots$

a power series. In this section, we investigate the set of $x$-values for which the
series (7.1) converges. We also study properties (continuity, derivative, primitive)
of the function represented by (7.1).


(7.1) Lemma. Suppose that the series (7.1) converges for a certain $\hat{x}$. Then, it also
converges for all $x$ with $|x| < |\hat{x}|$.

Moreover, for each $\eta$ with $0 < \eta < |\hat{x}|$, the series (7.1) converges absolutely
and uniformly on the interval $[-\eta, \eta]$.


Proof. The convergence of the series $\sum c_n \hat{x}^n$ implies that the sequence $\{c_n \hat{x}^n\}$
is bounded (see Eq. (2.3) and Theorem 1.3), i.e., there exists a $B \ge 0$ such that
$|c_n \hat{x}^n| \le B$ for all $n \ge 0$. Therefore, for $|x| \le \eta$, we have

$|c_n x^n| \le |c_n| \eta^n = |c_n \hat{x}^n| \cdot \Big| \frac{\eta}{\hat{x}} \Big|^n \le B \cdot \Big| \frac{\eta}{\hat{x}} \Big|^n.$

By Theorem 2.5, this implies the convergence and the absolute convergence of
$\sum c_n x^n$. The uniform convergence follows from Theorem 4.3.


(7.2) Definition. We set

(7.2) $\varrho = \sup \Big\{ |x| \; ; \; \sum_{n=0}^{\infty} c_n x^n \text{ converges} \Big\}$

and call $\varrho$ the radius of convergence of the series (7.1). We set $\varrho = \infty$ if (7.1)
converges for all real $x$.


(7.3) Theorem. The series (7.1) converges for all $x$ satisfying $|x| < \varrho$, it diverges
for all $x$ satisfying $|x| > \varrho$, and we have uniform convergence on $[-\eta, \eta]$ if
$0 < \eta < \varrho$.


Proof. Let $x$ be a value with $|x| < \varrho$. Then, there is an $\hat{x}$ with $|x| < |\hat{x}| \le \varrho$ such
that (7.1) converges for $\hat{x}$ (put $\varepsilon = (\varrho - |x|)/2$ in Definition 1.11). Thus, from
Lemma 7.1, we have convergence for $x$. The uniform convergence on $[-\eta, \eta]$ is
seen in the same way.


This theorem says nothing about the convergence at $x = -\varrho$ and $x = \varrho$. In
fact, anything can happen at these points, as we shall see in the following example.
(7.4) Example. The series

(7.3) $\sum_{n=1}^{\infty} \frac{x^n}{n^\alpha} = x + \frac{x^2}{2^\alpha} + \frac{x^3}{3^\alpha} + \frac{x^4}{4^\alpha} + \ldots$

is the geometric series (apart from the first term; Example 2.2) for $\alpha = 0$, reduces
to $-\ln(1 - x)$ for $\alpha = 1$ (see Eq. (I.3.14)), and is, for $\alpha = 2$, “Euler’s Dilogarithm”
(Euler 1768, Inst. Calc. Int., Sectio Prima, Caput IV, Exemplum 2). Independently
of $\alpha$, the radius of convergence of (7.3) is $\varrho = 1$ (see Example 7.6 below). For
$\alpha = 0$ the series diverges at both ends of the convergence interval. For $\alpha = 1$
we have divergence for $x = +1$ (harmonic series), but convergence for $x = -1$
(by Leibniz’s criterion). For $\alpha = 2$ the series converges for $x = +1$ and also for
$x = -1$ (see Lemma 2.6).


Determination of the Radius of Convergence 


The following theorems give useful formulas for the computation of the radius of 
convergence. 


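(7.5) Theorem (Cauchy 1821). If $\lim_{n\to\infty} |c_n/c_{n+1}|$ exists (or is $\infty$), then we have

(7.4) $\varrho = \lim_{n\to\infty} \Big| \frac{c_n}{c_{n+1}} \Big|.$


Proof. We apply the Ratio Test 2.10 to the series $\sum_n a_n$ with $a_n = c_n x^n$. Since

$\lim_{n\to\infty} \Big| \frac{a_{n+1}}{a_n} \Big| = \lim_{n\to\infty} \Big| \frac{c_{n+1} x^{n+1}}{c_n x^n} \Big| = |x| \lim_{n\to\infty} \Big| \frac{c_{n+1}}{c_n} \Big| = |x| \Big/ \lim_{n\to\infty} \Big| \frac{c_n}{c_{n+1}} \Big|,$

the series (7.1) converges if $|x| < \lim |c_n/c_{n+1}|$. For $|x| > \lim |c_n/c_{n+1}|$ it
diverges. This implies Eq. (7.4).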


(7.6) Examples. For the series (7.3), where $c_n = 1/n^\alpha$, we have $|c_n/c_{n+1}| =
(1 + 1/n)^\alpha \to 1$ for $n \to \infty$. Therefore, the radius of convergence is $\varrho = 1$. Similarly,
for the binomial series for $(1 + x)^a$ (Theorem I.2.2) we have $|c_n/c_{n+1}| =
(n + 1)/|a - n| \to 1$ and $\varrho = 1$.

The series expansions for $e^x$ (see Theorem I.2.3) and for $\sin x$ and $\cos x$ (see
Eqs. (I.4.16) and (I.4.17)) have been proved to converge for all real $x$ (Sect. III.2).
Hence, their radius of convergence is $\varrho = \infty$. An example for a series with $\varrho = 0$
is

$1 + x + 2!\,x^2 + 3!\,x^3 + 4!\,x^4 + \ldots .$

Here, we have $c_n = n!$ and $|c_n/c_{n+1}| = 1/(n + 1) \to 0$.
The formula of Theorem 7.5 is not directly applicable to the series

(7.5) $\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \ldots ,$

because $|c_n/c_{n+1}|$ is alternately 0 and $\infty$. If we divide by $x$ and replace $x^2$
by the new variable $z$, then the series (7.5) divided by $x$ becomes $\sum_n c_n z^n$ with
$c_n = (-1)^n/(2n + 1)$. For this series we have $\varrho = 1$ by Theorem 7.5. Hence, the
series (7.5) converges for $|x^2| < 1$ (i.e., $|x| < 1$) and we have $\varrho = 1$.
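
The ratios $|c_n/c_{n+1}|$ of Eq. (7.4) are easy to tabulate. The sketch below is an added illustration using exact rational arithmetic; the helper name `ratio`, the choice $\alpha = 2$, and the sampled indices are assumptions.

```python
from fractions import Fraction
from math import factorial


def ratio(c, n):
    """|c_n / c_{n+1}| from Eq. (7.4), computed exactly with fractions."""
    return abs(Fraction(c(n)) / Fraction(c(n + 1)))


alpha = 2
dilog = lambda n: Fraction(1, n**alpha)      # c_n = 1/n^alpha, series (7.3): ratio -> 1
fact = lambda n: Fraction(factorial(n))      # c_n = n!: ratio -> 0
expo = lambda n: Fraction(1, factorial(n))   # c_n = 1/n! (series of e^x): ratio -> infinity

for n in (10, 100, 1000):
    print(n, float(ratio(dilog, n)), float(ratio(fact, n)), float(ratio(expo, n)))
```

The three columns tend to 1, to 0, and to infinity, matching the radii $\varrho = 1$, $\varrho = 0$, and $\varrho = \infty$ found above.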

While Eq. (7.4) requires the existence of the limit, the next result is valid 
without restriction (see also Exercise 7.1 below). 


(7.7) Theorem (Hadamard 1892). The radius of convergence of the series (7.1) is
given by

(7.6) $\varrho = \frac{1}{\limsup_{n\to\infty} \sqrt[n]{|c_n|}}.$


Proof. We apply the Root Test 2.11 to the series $\sum_n a_n$ with $a_n = c_n x^n$. Since

$\limsup_{n\to\infty} \sqrt[n]{|a_n|} = |x| \cdot \limsup_{n\to\infty} \sqrt[n]{|c_n|},$

we see that the series (7.1) converges if $|x| < 1/\limsup \sqrt[n]{|c_n|}$. It diverges if
$|x| > 1/\limsup \sqrt[n]{|c_n|}$.
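
Formula (7.6) also covers the arctan series (7.5), where every second coefficient vanishes. The following sketch is an illustration under stated assumptions: the lim sup is replaced by a maximum over a finite window of indices (the hypothetical helper `root_sup`), which only approximates it.

```python
def c(n):
    """Coefficient of x^n in the arctan series (7.5): 0 for even n, +-1/n for odd n."""
    if n % 2 == 0:
        return 0.0
    return (-1) ** ((n - 1) // 2) / n


def root_sup(N, window=50):
    """Maximum of |c_n|^(1/n) over n = N, ..., N + window - 1 (a stand-in for lim sup)."""
    return max(abs(c(n)) ** (1.0 / n) for n in range(N, N + window))


estimates = [1.0 / root_sup(N) for N in (10, 100, 1000, 10000)]
print(estimates)   # slowly approaches the radius of convergence 1
```

The vanishing even coefficients contribute 0 to the maximum and therefore do not disturb the estimate, in contrast to the ratio formula (7.4).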


Continuity 

Let $D$ be the domain of convergence

(7.7) $D = \{ x \mid \text{series (7.1) converges} \},$

so that the series (7.1) defines a function $f : D \to \mathbb{R}$ given by

(7.8) $f(x) = \sum_{n=0}^{\infty} c_n x^n$ for $x \in D$.

It is clear from the uniform convergence on $[-\eta, \eta]$ for $0 < \eta < \varrho$ (see Theorems
7.3 and 4.2) that $f(x)$ is a continuous function in the open interval $(-\varrho, \varrho)$. The
following famous theorem of Abel handles the question of continuity at the end
points of the convergence interval.


(7.8) Theorem (Abel 1826). Suppose that the series (7.8) converges for $x_0 = \varrho$ (or
for $x_0 = -\varrho$). Then, the function $f(x)$ is continuous at $x_0 = \varrho$ (or at $x_0 = -\varrho$).


Proof. For simplicity we assume that $\varrho = 1$ and $x_0 = +1$. Otherwise, we stretch
and/or reverse the convergence interval by replacing $x$ by $\pm x/\varrho$.

Since, by hypothesis, we have convergence for $x_0 = 1$, it follows from
Lemma 2.3 that for $n \ge N$ and $k \ge 1$,

(7.9) $|c_{n+1} + c_{n+2} + \ldots + c_{n+k}| < \varepsilon.$

Now, let $x$ be chosen arbitrarily in $[0, 1]$. Then, for $f_n(x) = \sum_{i=0}^{n} c_i x^i$ we have

(7.10) $f_{n+k}(x) - f_n(x) = c_{n+1} x^{n+1} + c_{n+2} x^{n+2} + \ldots + c_{n+k} x^{n+k}.$

If all $c_i$ are $\ge 0$, it is clear from Eq. (7.9) that $|f_{n+k}(x) - f_n(x)| < \varepsilon$. Otherwise,
we split up (7.10) somewhat more carefully (written here for $k = 4$):

$c_{n+1} x^{n+4} + c_{n+2} x^{n+4} + c_{n+3} x^{n+4} + c_{n+4} x^{n+4}$
$+ c_{n+1}(x^{n+3} - x^{n+4}) + c_{n+2}(x^{n+3} - x^{n+4}) + c_{n+3}(x^{n+3} - x^{n+4})$
$+ c_{n+1}(x^{n+2} - x^{n+3}) + c_{n+2}(x^{n+2} - x^{n+3})$
$+ c_{n+1}(x^{n+1} - x^{n+2})$

(this process is called Abel’s partial summation, see Exercise 7.2). In each row,
we can now factor out a common (nonnegative) factor $x^{n+k}$, $x^{n+k-1} - x^{n+k}$, ... and
obtain, by (7.9) and the triangle inequality,

$|f_{n+k}(x) - f_n(x)| \le \varepsilon \cdot \big( x^{n+k} + x^{n+k-1} - x^{n+k} + \ldots + x^{n+1} - x^{n+2} \big) = \varepsilon\, x^{n+1} \le \varepsilon$

uniformly on $[0, 1]$. Therefore, the continuity of $f(x)$ at $x_0 = 1$ follows from
Theorem 4.2.


Differentiation and Integration 


Since $\sqrt[n]{n} \to 1$ for $n \to \infty$ (see Eq. (6.30)), it follows from Theorem 7.7 that the
(term by term) differentiated and integrated power series have the same radius of
convergence as the original series. We then have the following result.


(7.9) Theorem. The function $f(x) = \sum_{n=0}^{\infty} c_n x^n$ is differentiable for $|x| < \varrho$
(where $\varrho$ is the radius of convergence and $\varrho > 0$), and we have

(7.11) $f'(x) = \sum_{n=1}^{\infty} n c_n x^{n-1}.$

It has a primitive on $(-\varrho, \varrho)$, which is given by

(7.12) $\int_0^x f(t)\,dt = \sum_{n=0}^{\infty} c_n \frac{x^{n+1}}{n+1}.$


Proof. For $0 < \eta < \varrho$ the convergence of these series is uniform on $[-\eta, \eta]$
(and, of course, also on $(-\eta, \eta)$). It then follows from Theorem 6.18 that $f(x)$
is differentiable on $(-\eta, \eta)$ and that its derivative is given by (7.11). Similarly,
Eq. (7.12) follows from Corollary 5.20.


(7.10) Remark. If the series (7.1) converges, say, at $x = \varrho$, then the differentiated
series (7.11) need not converge there. This is the case, for example, with the series
(7.3) for $\alpha = 2$. However, the convergence of (7.1) at $x = \varrho$ implies the convergence
of (7.12) at $x = \varrho$ (see Exercise 7.3). With the use of Theorem 7.8, we thus
see that identity (7.12) holds for all $x \in D$.
(7.11) Example. The geometric series (Example 2.2) has radius of convergence
$\varrho = 1$. Integrating it term by term, we obtain from Theorem 7.9 and the definition
of $\ln$ (Sect. I.3) that for $x \in (-1, 1)$

(7.13) $\ln(1 + x) = \int_0^x \frac{dt}{1 + t} = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \ldots .$

Moreover, the series in (7.13) converges for $x = 1$ and, by Theorem 7.8, we obtain
$\ln 2 = 1 - 1/2 + 1/3 - 1/4 + \ldots$, this time rigorously.
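
The convergence of the series for $\ln 2$ is slow, which a short computation makes visible. This is a minimal sketch added for illustration; the helper name `partial_sum` and the sampled values of $N$ are assumptions.

```python
import math


def partial_sum(N):
    """1 - 1/2 + 1/3 - ... up to the term with denominator N."""
    return sum((-1) ** (n + 1) / n for n in range(1, N + 1))


for N in (10, 100, 1000, 10000):
    s = partial_sum(N)
    print(N, s, abs(s - math.log(2)))
```

The error shrinks only about like $1/(2N)$, consistent with the alternating-series bound $1/(N+1)$.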


Taylor Series 


... and to estimate the value of the remainder of the series. This problem,
one of the most important in the theory of series, has not yet been
solved ... (Lagrange 1797, p. 42-43, Oeuvres, vol. 9, p. 71)

In 1797 (second ed. 1813), Lagrange wrote an entire treatise basing analysis on
the Taylor series expansion of a function (see Eq. (II.2.8))

(7.14) $f(x) = \sum_{i=0}^{\infty} \frac{(x - a)^i}{i!} f^{(i)}(a),$

which allowed him, as he thought, to banish infinitely small quantities, limits, and
fluxions (“dégagés de toute considération d’infiniment petits, d’évanouissans, de
limites ou de fluxions”). This dream, however, only lasted some 25 years.

Regarding $x - a$ as a new variable, this series is of the form (7.1) and the
previous results on the convergence of the series can be applied. The first problem
is that there are infinitely differentiable functions for which the series (7.14) does
not converge for any $x \ne a$ (see Exercise 7.6 below). But even convergence of the
series in (7.14) does not necessarily imply the identity in (7.14), as we shall see in
the subsequent counterexample.


(7.12) Counterexample. 


... Taylor’s formula, which can no longer be admitted in general ... 
(Cauchy 1823, Résumé, p. 1) 


Cauchy (1823) considered the function

(7.15) $f(x) = e^{-1/x^2}$ if $x \ne 0$, $f(x) = 0$ if $x = 0$,

which is continuous everywhere. This function is so terribly flat at the origin (see
Fig. 7.1), that $f^{(i)}(0) = 0$ for all $i$. In fact, by the rules of differentiation, we obtain
(for $x \ne 0$)

$f'(x) = \frac{2}{x^3}\, e^{-1/x^2}$, $f''(x) = \Big( \frac{4}{x^6} - \frac{6}{x^4} \Big) \cdot e^{-1/x^2},$

and we see that $f^{(i)}(x)$ is a polynomial in $1/x$ multiplied by $e^{-1/x^2}$. Since for
all $n$ the functions $x^{-n} e^{-1/x^2}$ tend to zero as $x \to 0$ (see the examples after
Theorem 6.17), we have $f^{(i)}(x) \to 0$ for $x \to 0$. The fact that also $f^{(i)}(x)/x \to 0$
for $x \to 0$ implies that $f^{(i+1)}(0) = \lim_{h\to 0} f^{(i)}(h)/h = 0$.


FIGURE 7.1. Graph of $e^{-1/x^2}$ and its derivatives


Thus, the Taylor series for the function $f(x)$ of (7.15) is $0 + 0 + 0 + \ldots$ and
obviously converges for all $x$. But formula (7.14) is wrong for $x \ne 0$.
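
The flatness of Cauchy’s function can be seen in numbers. The sketch below is an added illustration; the exponent 10 in the quotient $f(x)/x^{10}$ and the sample points are arbitrary assumptions.

```python
import math


def f(x):
    """Cauchy's function (7.15): exp(-1/x^2) for x != 0, and 0 at the origin."""
    return math.exp(-1.0 / x**2) if x != 0 else 0.0


taylor_sum = 0.0                    # every Taylor coefficient at the origin vanishes
gap = f(0.5) - taylor_sum           # equals e^{-4}, so the series does not represent f
flatness = [f(x) / x**10 for x in (0.5, 0.2, 0.1, 0.05)]
print(gap)
print(flatness)                     # x^{-10} f(x) still tends to 0 as x -> 0
```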


In order to establish Eq. (7.14) for particular functions, we have to consider 
partial sums of Taylor series and to estimate their error. A useful formula in this 
context has already been derived at the end of Sect. II.4. It is summarized in the 
following theorem. 


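(7.13) Theorem. Let $f(x)$ be $k + 1$ times continuously differentiable on $[a, x]$ (or
on $[x, a]$ if $x < a$). Then, we have

$f(x) = \sum_{i=0}^{k} \frac{(x - a)^i}{i!} f^{(i)}(a) + \int_a^x \frac{(x - t)^k}{k!} f^{(k+1)}(t)\,dt.$
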

The Binomial Series.


... but the one which gives me most pleasure is a paper ... on the simple
series

$1 + mx + \frac{m(m - 1)}{2} x^2 + \ldots$

I dare say that this is the first rigorous proof of the binomial formula ...
(Abel, letter to Holmboe 1826, Oeuvres, vol. 2, p. 261)


A rigorous proof of the binomial identity

(7.16) $(1 + x)^a = 1 + ax + \frac{a(a - 1)}{2!} x^2 + \frac{a(a - 1)(a - 2)}{3!} x^3 + \ldots$

for $|x| < 1$ and arbitrary $a$ was first considered by Abel in 1826. A proof based on
Taylor series can be found in Weierstrass’s lecture of 1861 (see Weierstrass 1861).
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If we put f(x) = (1+ .2)* and compute its derivatives f’(x) = a(1+2)*"1, 
f" (x) = a(a— 1)(1+ x)*~?,..., we observe that the series of (7.16) is simply 
the Taylor series of f(a) = (1+ 2)*. Its radius of convergence has been computed 
as 9 = 1 in Example 7.6. In order to prove identity (7.16) for |x| < 1, we have to 
show that the remainder (see Theorem 7.13) 


(7.17)    $R_k(x) = \int_0^x \frac{(x-t)^k}{k!}\, a(a-1)\cdots(a-k)\,(1+t)^{a-k-1}\, dt$


converges to zero for $k \to \infty$.
Using Theorem 5.17 and putting $\xi = \theta_k x$ with $0 < \theta_k < 1$, we obtain


    $R_k(x) = \frac{(x-\theta_k x)^k}{k!}\, a(a-1)\cdots(a-k)\,(1+\theta_k x)^{a-k-1} \cdot x$
    $\qquad\;\; = \frac{(a-1)(a-2)\cdots(a-k)}{k!}\, x^k \cdot \Bigl(\frac{1-\theta_k}{1+\theta_k x}\Bigr)^{k} \cdot (1+\theta_k x)^{a-1} \cdot ax.$


The factor $ax$ is a constant; $(1+\theta_k x)^{a-1}$ lies between $(1+x)^{a-1}$ and 1 and is
bounded; $0 < 1-\theta_k < 1+\theta_k x$ for all $x$ satisfying $|x| < 1$ implies that the factor
$((1-\theta_k)/(1+\theta_k x))^k$ is bounded by 1. Since the remaining factor


    $\frac{(a-1)(a-2)\cdots(a-k)}{k!}\, x^k$


is, for $|x| < 1$, the general term of a convergent series, it tends to zero by (2.3).
Consequently, we have $R_k(x) \to 0$ for $k \to \infty$ and the identity (7.16) is established
for $|x| < 1$.
Whenever the series (7.16) converges for $x = +1$ or $x = -1$, it represents a
continuous function and thus equals $(1+x)^a$ at these points also (Theorem 7.8).
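
As a numerical illustration of the binomial identity, the partial sums of the series can be compared with $(1+x)^a$ for some $|x| < 1$. The Python sketch below is only an experiment under our own choice of test values; the helper name `binomial_partial_sum` is hypothetical.

```python
def binomial_partial_sum(a: float, x: float, k: int) -> float:
    """Sum of the terms of the binomial series up to and including x^k."""
    term, total = 1.0, 1.0
    for j in range(1, k + 1):
        # Each coefficient arises from the previous one by the factor (a - j + 1)/j.
        term *= (a - j + 1) / j * x
        total += term
    return total

a, x = 0.5, 0.5
exact = (1 + x) ** a
for k in (2, 5, 10, 20, 40):
    approx = binomial_partial_sum(a, x, k)
    print(k, approx, abs(approx - exact))
```

For a nonnegative integer $a$ the factor $a - j + 1$ eventually vanishes, so the series terminates and reproduces the classical binomial theorem.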


Estimate of the Remainder without Integral Calculus. The attempts of Lagrange
(1797) to evaluate the remainder in Taylor's formula were crowned by the
following elegant formulas ("ce théorème nouveau et remarquable par sa simplicité
et sa généralité ..."):


          $f(x) = f(a) + (x-a)\, f'(\xi)$
(7.18)    $f(x) = f(a) + (x-a)\, f'(a) + \frac{(x-a)^2}{2!}\, f''(\xi)$
          $f(x) = f(a) + (x-a)\, f'(a) + \frac{(x-a)^2}{2!}\, f''(a) + \frac{(x-a)^3}{3!}\, f'''(\xi),$


etc., where $\xi$ is an unknown value between $a$ and $x$.


(7.14) Theorem (Lagrange 1797). Let $f(x)$ be continuous on $[a,x]$ and $k+1$
times differentiable on $(a,x)$. Then, there exists $\xi \in (a,x)$ such that


    $f(x) = \sum_{i=0}^{k} \frac{(x-a)^i}{i!}\, f^{(i)}(a) + \frac{(x-a)^{k+1}}{(k+1)!}\, f^{(k+1)}(\xi).$


Proof. We follow an elegant idea of Cauchy (1823), denote the remainder by


(7.19)    $R_k(x) = f(x) - \sum_{i=0}^{k} \frac{(x-a)^i}{i!}\, f^{(i)}(a),$


and compare it to the function $S_k(x) = (x-a)^{k+1}/(k+1)!$. We have


    $R_k(a) = 0, \quad R_k'(a) = 0, \quad \ldots, \quad R_k^{(k)}(a) = 0,$


and the same is true for $S_k$ and its first $k$ derivatives. Applying the mean value
theorem for quotients repeatedly, we get


(7.20)    $\frac{R_k(x)}{S_k(x)} = \frac{R_k(x) - R_k(a)}{S_k(x) - S_k(a)} = \frac{R_k'(\xi_1)}{S_k'(\xi_1)} = \frac{R_k'(\xi_1) - R_k'(a)}{S_k'(\xi_1) - S_k'(a)} = \frac{R_k''(\xi_2)}{S_k''(\xi_2)} = \ldots = \frac{R_k^{(k+1)}(\xi_{k+1})}{S_k^{(k+1)}(\xi_{k+1})},$


where $\xi_1$ lies between $x$ and $a$, $\xi_2$ between $\xi_1$ and $a$, and so on. Since $S_k^{(k+1)}(x) = 1$
and $R_k^{(k+1)}(x) = f^{(k+1)}(x)$, we obtain from (7.20) that


    $R_k(x) = S_k(x) \cdot f^{(k+1)}(\xi)$


with $\xi = \xi_{k+1}$. This completes the proof of the theorem.


Remark. The relation between the remainders of Theorems 7.13 and 7.14 is given 
by Theorem 5.18. For the original proof of Lagrange see Exercise 7.8 below. 
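
For a concrete function, Lagrange's intermediate point can be recovered numerically. The Python sketch below takes $f(x) = e^x$ and $a = 0$ (our choice of example), where the remainder equals $x^{k+1} e^{\xi}/(k+1)!$, and solves for $\xi$; the helper names are hypothetical.

```python
import math

def taylor_exp(x: float, k: int) -> float:
    """Taylor polynomial of degree k of exp at a = 0."""
    return sum(x**i / math.factorial(i) for i in range(k + 1))

def lagrange_xi(x: float, k: int) -> float:
    """Solve R_k(x) = x^(k+1)/(k+1)! * exp(xi) for xi (intended for x > 0)."""
    remainder = math.exp(x) - taylor_exp(x, k)
    base = x ** (k + 1) / math.factorial(k + 1)
    return math.log(remainder / base)

for k in (1, 2, 4, 8):
    # The theorem predicts a value strictly between a = 0 and x = 1.
    print(k, lagrange_xi(1.0, k))
```

In each case the computed $\xi$ lies strictly between 0 and $x$, as the theorem asserts.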


Exercises 
7.1 Determine the radius of convergence of the series

    $f(x) = 1 + 2x + x^2 + 2x^3 + x^4 + 2x^5 + \ldots$

and show that Theorem 7.5 is not applicable, but that Theorem 7.7 is.


7.2 (Partial summation, Abel 1826). Let $\{a_n\}$ and $\{b_n\}$ be two sequences. Prove
that

    $\sum_{n=0}^{N} a_n b_n = \sum_{n=0}^{N} A_n (b_n - b_{n+1}) + A_N b_{N+1} - A_{-1} b_0,$

where $A_{-1} = \alpha$ is an arbitrary constant and $A_n = \alpha + a_0 + a_1 + \ldots + a_n$.
Hint. Use the identity

    $a_n b_n = (A_n - A_{n-1})\, b_n = A_n (b_n - b_{n+1}) - A_{n-1} b_n + A_n b_{n+1}.$


7.3 Consider the series

    $\sum_{n=1}^{\infty} c_n$   and   $\sum_{n=1}^{\infty} \frac{c_n}{n}.$

Prove that the convergence of the first series implies that of the second. The
proof will encounter a difficulty similar to that in the proof of Theorem 7.8,
which can be settled by a similar idea (partial summation, see preceding
exercise).


7.4 Investigate the convergence of the series of Newton-Gregory

    $\arcsin(x) = x + \frac{1}{2}\,\frac{x^3}{3} + \frac{1\cdot 3}{2\cdot 4}\,\frac{x^5}{5} + \frac{1\cdot 3\cdot 5}{2\cdot 4\cdot 6}\,\frac{x^7}{7} + \ldots$

for $x = 1$ and $x = -1$.
Hint. Wallis's product will be useful for understanding the asymptotic behavior
of the coefficients.


7.5 Let $D'$ be the domain of convergence for the series in Eq. (7.11). Prove that
the identity in Eq. (7.11) holds for all $x \in D'$.


7.6 An infinitely differentiable function whose Taylor series does not converge
(see Lerch 1888, Pringsheim 1893); show that the series

    $f(x) = \frac{\cos 2x}{1!} + \frac{\cos 4x}{2!} + \frac{\cos 8x}{3!} + \frac{\cos 16x}{4!} + \ldots$

and all its derivatives converge uniformly in $\mathbb{R}$. Show that its Taylor series
at the origin is

    $f(0) + f'(0)\,x + \ldots = (e - 1) - \frac{e^4 - 1}{2!}\, x^2 + \frac{e^{16} - 1}{4!}\, x^4 - \ldots$

and diverges for all $x \ne 0$.
Nevertheless, for the computation of, say, $f(0.01)$ (correct value $f(0.01) =
1.71572953$) the first two terms of this series are useful. Why?


7.7 Investigate the convergence of the series (7.16) for $x = 1$ and $x = -1$.


7.8 Find formulas (7.18) in the footprints of Lagrange by using, as we would say
today, a "homotopy" argument.
Hint. Put

(7.21)    $f(x) = f(x - zx) + zx\, f'(x - zx) + \frac{z^2 x^2}{2!}\, f''(x - zx) + \frac{x^3}{3!}\, R(z),$

where $z$ is a variable between 0 and 1 and where $x$ is considered as a fixed
constant. Setting $z = 0$, we find $R(0) = 0$, and with $z = 1$, we see that
$(x^3/3!)\,R(1)$ is the error term we are looking for. Now, differentiate (7.21)
with respect to $z$ and find $R'(z) = 3z^2 f'''(x - zx)$. Finally, integrate from
0 to 1 and apply Theorem 5.18.


7.9 (Abel 1826). Prove that if the series $\sum_i a_i$, $\sum_j b_j$ and their Cauchy product
converge, identity (2.19) holds.
Hint. Apply Abel's Theorem 7.8 to the function $f(x) \cdot g(x)$, where
$f(x) = \sum_i a_i x^i$ and $g(x) = \sum_j b_j x^j$.
III.8 Improper Integrals


The theory of the Riemann integral $\int_a^b f(x)\, dx$ in Sect. III.5 is based on the
assumptions that $[a, b]$ is a finite interval and the function $f(x)$ is bounded on this
interval. We shall show how these restrictions can be circumvented. If at least one
of the two assumptions is violated, we speak of an improper integral.


Bounded Functions on Infinite Intervals 


(8.1) Definition. Let $f : [a,\infty) \to \mathbb{R}$ be integrable on every interval $[a, b]$ with
$b > a$. If the limit


    $\int_a^{\infty} f(x)\, dx := \lim_{b \to \infty} \int_a^b f(x)\, dx$


exists, then we say that $f(x)$ is integrable on $[a,\infty)$ and that $\int_a^{\infty} f(x)\, dx$ is a
convergent integral.


Only wimps do the general case. True teachers tackle examples. 
(Parlett, see Math. Intelligencer, vol. 14, No. 1, p. 35) 


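(8.2) Examples. Consider first the exponential function on the interval $[0, \infty)$. By
Definition 8.1, we have


    $\int_0^{\infty} e^{-x}\, dx = \lim_{b \to \infty} \int_0^b e^{-x}\, dx = \lim_{b \to \infty} \bigl(-e^{-x}\bigr)\Big|_0^b = \lim_{b \to \infty} \bigl(1 - e^{-b}\bigr) = 1.$


Once we are accustomed to this definition, we simply write


(8.1)    $\int_0^{\infty} e^{-x}\, dx = -e^{-x}\Big|_0^{\infty} = 1.$


Next, consider the function $x^{-\alpha}$ on $[1, \infty)$:


(8.2)    $\int_1^{\infty} \frac{dx}{x^{\alpha}} = \int_1^{\infty} x^{-\alpha}\, dx = \frac{x^{1-\alpha}}{1-\alpha}\Big|_1^{\infty} = \begin{cases} \text{diverges} & \text{if } \alpha < 1, \\ (\alpha - 1)^{-1} & \text{if } \alpha > 1. \end{cases}$


For $\alpha = 1$ a primitive is $\ln x$ and the improper integral diverges.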
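
The three cases of Eq. (8.2) can be observed by evaluating the primitive at larger and larger upper limits $b$. This Python sketch uses only the closed forms quoted above; the function name `integral_power` is our own.

```python
import math

def integral_power(alpha: float, b: float) -> float:
    """Value of the proper integral of x^(-alpha) over [1, b], via the primitive."""
    if alpha == 1.0:
        return math.log(b)
    return (b ** (1 - alpha) - 1) / (1 - alpha)

for alpha in (0.5, 1.0, 2.0):
    # alpha = 2 settles near 1/(alpha - 1) = 1; the other two keep growing.
    print(alpha, [integral_power(alpha, b) for b in (1e2, 1e4, 1e6)])
```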
But how can we check the integrability on $[a, \infty)$ if no primitive is known
explicitly?


(8.3) Lemma. Let $f : [a,\infty) \to \mathbb{R}$ be integrable on every interval $[a, b]$.

a) If $|f(x)| \le g(x)$ for all $x \ge a$ and if $\int_a^{\infty} g(x)\, dx$ is convergent, then
$\int_a^{\infty} f(x)\, dx$ is also convergent.

b) If $0 \le g(x) \le f(x)$ for all $x \ge a$ and if $\int_a^{\infty} g(x)\, dx$ is divergent, then
$\int_a^{\infty} f(x)\, dx$ also diverges.

Proof. Part (a) follows from Cauchy's criterion (Theorem 3.12), and from Theorem
5.14, because $\bigl|\int_b^{\hat b} f(x)\, dx\bigr| \le \int_b^{\hat b} |f(x)|\, dx \le \int_b^{\hat b} g(x)\, dx < \varepsilon$ for sufficiently
large $b < \hat b$. Part (b) is obvious.


(8.4) Example. For $\alpha > 0$ we consider the function $(1 + x^{\alpha})^{-1}$ on the interval
$[0, \infty)$. We split the integral according to


(8.3)    $\int_0^{\infty} \frac{dx}{1 + x^{\alpha}} = \int_0^{1} \frac{dx}{1 + x^{\alpha}} + \int_1^{\infty} \frac{dx}{1 + x^{\alpha}}.$


The first integral is "proper". For the second integral we use the estimates


    $\frac{1}{2x^{\alpha}} \le \frac{1}{1 + x^{\alpha}} \le \frac{1}{x^{\alpha}}$   for $x \ge 1$.


It thus follows from Lemma 8.3 and Eq. (8.2) that the integral (8.3) converges for
$\alpha > 1$ and diverges for $\alpha \le 1$.


FIGURE 8.1. Graph of $\sin x/x$


(8.5) Example. Let us investigate the existence of


(8.4)    $\int_0^{\infty} \frac{\sin x}{x}\, dx.$


The function $f(x) = \sin x/x$ is continuous at $x = 0$ with $f(0) = 1$ and so poses
no difficulty at this point. Using the estimate $|\sin x| \le 1$ would be pointless, since
the integral $\int_1^{\infty} x^{-1}\, dx$ diverges. But the graph of $f(x)$ (see Fig. 8.1) shows that
the integral can be written as an alternating series of the form $a_0 - a_1 + a_2 - a_3 +
\ldots$, where


    $a_0 = \int_0^{\pi} \frac{\sin x}{x}\, dx, \quad a_1 = -\int_{\pi}^{2\pi} \frac{\sin x}{x}\, dx, \quad a_2 = \int_{2\pi}^{3\pi} \frac{\sin x}{x}\, dx, \quad \ldots$


This series converges by Leibniz's criterion (Theorem 2.4). The condition $a_{i+1} \le
a_i$ can be verified with help of the substitution $x \mapsto x - \pi$ and $a_i \to 0$ follows
from the simple estimate $0 < a_i \le 1/i$.
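
The two conditions of Leibniz's criterion can also be checked numerically. The following Python sketch approximates the $a_i$ by a composite Simpson rule (our choice of quadrature; all names are illustrative) and forms the alternating partial sum. The comparison value $\pi/2$ is the known value of this integral, a fact not needed in the argument above.

```python
import math

def sinc(x: float) -> float:
    """sin(x)/x, continuously extended by 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def simpson(f, lo: float, hi: float, n: int = 200) -> float:
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for j in range(1, n):
        s += (4 if j % 2 else 2) * f(lo + j * h)
    return s * h / 3

def a(i: int) -> float:
    """Absolute value of the integral of sin(x)/x over [i*pi, (i+1)*pi]."""
    return abs(simpson(sinc, i * math.pi, (i + 1) * math.pi))

terms = [a(i) for i in range(200)]
partial = sum((-1) ** i * t for i, t in enumerate(terms))
print(terms[:4])
print(partial, math.pi / 2)
```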
(8.6) Theorem (Maclaurin 1742). Let $f(x) \ge 0$ be nonincreasing on $[1, \infty)$. Then,
we have


    $\sum_{n=1}^{\infty} f(n)$ converges   $\iff$   $\int_1^{\infty} f(x)\, dx$ converges.


FIGURE 8.2. Majorization and minorization of $f(x)$


Proof. Let $g(x) = f([x])$ and $h(x) = f([x] + 1)$ be the step functions drawn in
Fig. 8.2 (here $[x]$ denotes the largest integer not exceeding $x$). These functions are
integrable on finite intervals (Theorem 5.11), and, since $f(x)$ is monotonic, we
have $h(x) \le f(x) \le g(x)$ for all $x$. Consequently,


    $\sum_{n=2}^{N} f(n) \le \int_1^{N} f(x)\, dx \le \sum_{n=1}^{N-1} f(n)$


and the statement follows from Theorem 1.13 since $f(x) \ge 0$.


As integrals are often easier to calculate than sums, this theorem is very 
useful for discussing the convergence of series. For example, the computation of 
Eq. (8.2) gives an elegant new proof for Lemma 2.6. 
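
The sandwich used in the proof of Theorem 8.6 is easy to observe for a concrete function. The Python sketch below takes $f(x) = 1/x^2$ (our choice), for which the integral over $[1, N]$ equals $1 - 1/N$, and prints the lower sum, the integral, and the upper sum.

```python
def f(x: float) -> float:
    """A nonnegative, nonincreasing sample function on [1, infinity)."""
    return 1.0 / (x * x)

def integral_f(N: int) -> float:
    """Exact integral of 1/x^2 over [1, N]."""
    return 1.0 - 1.0 / N

def sandwich(N: int):
    """Return (sum_{n=2}^{N} f(n), integral over [1, N], sum_{n=1}^{N-1} f(n))."""
    lower = sum(f(n) for n in range(2, N + 1))
    upper = sum(f(n) for n in range(1, N))
    return lower, integral_f(N), upper

for N in (10, 100, 1000):
    print(N, sandwich(N))
```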

If we try to study what happens "between" the divergent series $\sum 1/n$ and
the convergent series $\sum 1/n^{\alpha}$ (for some $\alpha > 1$), we are led to the investigation of


(8.5)    $\sum_{n=2}^{\infty} \frac{1}{n\,(\ln n)^{\beta}}$


(for large $n$ and any $\alpha > 1$ and $\beta > 0$ we have $n < n(\ln n)^{\beta} < n^{\alpha}$ by Eq. (6.27)).
With the transformation $u = \ln x$, we have


    $\int_2^{\infty} \frac{dx}{x\,(\ln x)^{\beta}} = \int_{\ln 2}^{\infty} \frac{du}{u^{\beta}},$


and Theorem 8.6, together with Eq. (8.2), proves that the series (8.5) converges
for $\beta > 1$, but diverges for $\beta \le 1$.
Integrals from $-\infty$ to $+\infty$. It would be injudicious to define


(8.6)    $\int_{-\infty}^{\infty} f(x)\, dx = \lim_{b \to \infty} \int_{-b}^{b} f(x)\, dx$


(if the limit exists). This would produce nonsense, for example, by applying the
transformation formula (II.4.14) with $z = x + 1$ ($dz = dx$). With the above
definition, we would have


    $\int_{-\infty}^{+\infty} z\, dz = 0$   and   $\int_{-\infty}^{+\infty} (x + 1)\, dx = \infty.$


(8.7) Definition. Let $f : \mathbb{R} \to \mathbb{R}$ be integrable on every bounded interval $[a, b]$.
Then, we say that


    $\int_{-\infty}^{\infty} f(x)\, dx = \int_{-\infty}^{0} f(x)\, dx + \int_{0}^{\infty} f(x)\, dx$


exists if both improper integrals to the right exist.


The two integrals


    $\int_{-\infty}^{\infty} \frac{dx}{1 + x^2}$   and   $\int_{-\infty}^{\infty} e^{-x^2}\, dx$


converge in the sense of Definition 8.7. The first one tends to $\pi$ (a primitive is
$\arctan x$). The convergence of the second integral is seen from Lemma 8.3 by
using $e^{-x^2} \le e^{-x}$ for $x \ge 1$.


Unbounded Functions on a Finite Interval 


(8.8) Definition (Gauss 1812, §36). If $f : (a,b] \to \mathbb{R}$ is integrable on every
interval of the form $[a + \varepsilon, b]$, then we define


    $\int_a^b f(x)\, dx = \lim_{\varepsilon \to 0+} \int_{a+\varepsilon}^{b} f(x)\, dx$


if the limit exists.


This definition includes situations where $|f(x)| \to \infty$ for $x \to a$. A similar
definition is possible when $|f(x)| \to \infty$ for $x \to b$. In order to check the
integrability of such a function, Lemma 8.3 can be adapted without any difficulty.


(8.9) Examples. For the function $x^{-\alpha}$ considered on the interval $(0, 1]$ we have


(8.7)    $\int_0^1 \frac{dx}{x^{\alpha}} = \lim_{\varepsilon \to 0+} \int_{\varepsilon}^{1} \frac{dx}{x^{\alpha}} = \lim_{\varepsilon \to 0+} \frac{x^{1-\alpha}}{1-\alpha}\Big|_{\varepsilon}^{1} = \begin{cases} \text{diverges} & \text{if } \alpha > 1, \\ (1 - \alpha)^{-1} & \text{if } \alpha < 1. \end{cases}$


The case $\alpha = 1$ also leads to a divergent integral. Hence, the hyperbola $y = 1/x$
($\alpha = 1$) is the limiting case with infinite area on the left ($0 < x \le 1$) and on the
right ($x \ge 1$). If $\alpha$ decreases, the left area becomes finite, if $\alpha$ increases, the right
area becomes finite.


The integral


    $\int_0^1 \frac{\sin x}{x^{\alpha}}\, dx = \int_0^1 \frac{\sin x}{x} \cdot \frac{1}{x^{\alpha - 1}}\, dx$


converges if and only if $\alpha - 1 < 1$, i.e., $\alpha < 2$. This is due to the fact that
$f(x) = \sin x/x$ is continuous at zero with $f(0) = 1$.


Euler’s Gamma Function 


Throughout his life, Euler was interested in "interpolating" the factorials $0! =
1$, $1! = 1$, $2! = 2$, $3! = 6$, $4! = 24, \ldots$ at noninteger values. He wrote for
this $1 \cdot 2 \cdot 3 \cdot 4 \cdot \ldots \cdot x$ ("De Differentiatione Functionum Inexplicabilium", see
1755, Caput XVI of Inst. Calc. Diff., Opera, vol. X). He finally found the definition
(totally "explicabilium") used today in 1781: integration by parts applied to the
following integral (with $u(x) = x^n$, $v'(x) = e^{-x}$) yields


(8.8)    $\int_0^{\infty} x^n e^{-x}\, dx = -x^n e^{-x}\Big|_0^{\infty} + n \int_0^{\infty} x^{n-1} e^{-x}\, dx.$


The term $x^n e^{-x}$ vanishes for $x = 0$ ($n > 0$) and for $x \to \infty$, so we find that


(8.9)    $\int_0^{\infty} x^n e^{-x}\, dx = n!$


Here, we have no problem replacing $n$ by a noninteger real number:
(8.10) Definition. For $\alpha > 0$ we define


(8.10)    $\Gamma(\alpha) := \int_0^{\infty} x^{\alpha - 1} e^{-x}\, dx.$


We have to show that the integral of Eq. (8.10) is convergent. There are two
difficulties: the integrated function is unbounded for $x \to 0$ (if $\alpha < 1$) and the
integration interval is infinite. We therefore split the integral into


(8.11)    $\int_0^{1} x^{\alpha - 1} e^{-x}\, dx + \int_1^{\infty} x^{\alpha - 1} e^{-x}\, dx.$


It follows from the estimate $x^{\alpha - 1} e^{-x} \le x^{\alpha - 1}$, from Lemma 8.3, and from
Eq. (8.7) that the first integral in (8.11) converges for $\alpha > 0$. For the second
integral in (8.11) we use the estimate $x^{\alpha - 1} e^{-x} = x^{\alpha - 1} e^{-x/2} \cdot e^{-x/2} \le M e^{-x/2}$
(see the examples after Theorem 6.17) and again Lemma 8.3.

Equation (8.9) and the computation in Eq. (8.8) show that


(8.12)    $\Gamma(n + 1) = n!, \qquad \Gamma(\alpha + 1) = \alpha\, \Gamma(\alpha)$   for $\alpha > 0$.


With the help of the second relation in (8.12), one can extend the definition of
$\Gamma(\alpha)$ to negative $\alpha$ ($\alpha \ne -1, -2, -3, \ldots$) by setting


(8.13)    $\Gamma(\alpha - 1) = \frac{\Gamma(\alpha)}{\alpha - 1}$


(see Fig. 8.3). We shall see in Sect. IV.5 that $\Gamma(1/2) = \sqrt{\pi}$.
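
The relations (8.12) can be checked by evaluating the defining integral numerically. The Python sketch below truncates the integral at $x = 60$ and uses Simpson's rule; both the cutoff and the restriction to $\alpha \ge 1$ (which avoids the singularity at the origin) are simplifying assumptions of ours. Python's built-in `math.gamma` serves as a reference.

```python
import math

def gamma_numeric(alpha: float, upper: float = 60.0, n: int = 6000) -> float:
    """Simpson approximation of the integral of x^(alpha-1) e^(-x) over [0, upper].

    Intended for alpha >= 1, where the integrand is bounded at x = 0.
    """
    g = lambda x: x ** (alpha - 1) * math.exp(-x)
    h = upper / n
    s = g(0.0) + g(upper)
    for j in range(1, n):
        s += (4 if j % 2 else 2) * g(j * h)
    return s * h / 3

for alpha in (1.0, 4.5, 5.0):
    print(alpha, gamma_numeric(alpha), math.gamma(alpha))

# Functional equation Gamma(alpha + 1) = alpha * Gamma(alpha) at alpha = 4.5:
print(gamma_numeric(5.5), 4.5 * gamma_numeric(4.5))
```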


FIGURE 8.3. Gamma function


Exercises


8.1 Show that the Fresnel integrals (see Fig. II.6.2)


    $\int_0^{\infty} \sin x^2\, dx, \qquad \int_0^{\infty} \cos x^2\, dx$


converge (you can also use a change of coordinates and find an integral
similar to (8.4); compare with Fig. 8.1).


8.2 Show that for the sequence


    $a_n = 2\sqrt{n} - \sum_{k=1}^{n} \frac{1}{\sqrt{k}}$


$\lim_{n \to \infty} a_n$ exists and $1 < \lim_{n \to \infty} a_n < 2$ (it might be helpful to remember
that $\int (1/\sqrt{x})\, dx = 2\sqrt{x}$).


8.3 Show, by using an appropriate change of coordinates, that


    $\int_0^{\infty} e^{-x^2}\, dx = \frac{1}{2}\, \Gamma\Bigl(\frac{1}{2}\Bigr).$

III.9 Two Theorems on Continuous Functions


This section is devoted to two results of Weierstrass. The first proves the existence 
of continuous functions that are nowhere differentiable. The second shows that a 
continuous function f : [a,b] — R can be approximated arbitrarily closely by 
polynomials. 


Continuous, but Nowhere Differentiable Functions 


Until very recently it was generally believed, that a . .. continuous function 
... always has a first derivative whose value can be indefinite or infinite 
only at some isolated points. Even in the work of Gauss, Cauchy, Dirichlet, 
mathematicians who were accustomed to criticize everything in their field 
most severely, there can not be found, as far as I know, any expression of a 
different opinion. (Weierstrass 1872) 


A hundred years ago such a function would have been considered an out- 
rage on common sense. 
(Poincaré 1899, L’oeuvre math. de Weierstrass, Acta Math., vol. 22, p.5) 


FIGURE 9.1. Riemann's function (9.1) near $x = \pi$


Before the era of Riemann and Weierstrass, it was generally believed that every
continuous function was also differentiable, with the possible exception of some
singular points (see quotations). In 1806, A.-M. Ampère (a name that you have
surely heard) even published a "proof" of this fact (J. Ecole Polyt., vol. 6, p. 148).
The first shock was Riemann's example (5.24), which, when integrated, produces
a function which is not differentiable on an everywhere dense set of points. This
opened the way to the search for functions that were nowhere differentiable. About
1861 (see Weierstrass 1872), Riemann thought that the function (see Eq. (3.7))


(9.1)    $f(x) = \sum_{n=1}^{\infty} \frac{\sin(n^2 x)}{n^2} = \sin x + \frac{1}{4}\sin(4x) + \frac{1}{9}\sin(9x) + \ldots,$


which is continuous since the series converges uniformly (see Theorems 4.3 and
4.2), is nowhere differentiable. Weierstrass declared himself unable to prove this
assertion and, indeed, Gerver (1970) found that (9.1) is differentiable at selected
points, for example at $x = \pi$ (see Fig. 9.1).


(9.1) Theorem (Weierstrass 1872). There exist continuous functions that are 
nowhere differentiable. 


Proof. Weierstrass showed, after two pages of calculation, that


(9.2)    $f(x) = \sum_{n=1}^{\infty} b^n \cos(a^n x),$


which is uniformly convergent for $b < 1$, is nowhere differentiable for $ab >
1 + 3\pi/2$. Many later researchers, intrigued by this phenomenon, found new
examples, in particular Dini (1878, Chap. 10), von Koch (1906, see Fig. IV.5.6 below),
Hilbert (1891, see Fig. IV.2.3 below), and Takagi (1903). Takagi's function was
reinvented by Tall (1982) and named the "blancmange function". This function is
defined as follows: we consider the function


(9.3)    $K(x) = \begin{cases} x & 0 \le x \le 1/2 \\ 1 - x & 1/2 \le x \le 1 \end{cases}$


and extend it periodically (i.e., $K(x + 1) = K(x)$ for all $x$) in order to get a
continuous zigzag function. Then, we define (see Fig. 9.2)


(9.4)    $f(x) = \sum_{n=0}^{\infty} \frac{1}{2^n}\, K(2^n x) = K(x) + \frac{1}{2} K(2x) + \frac{1}{4} K(4x) + \frac{1}{8} K(8x) + \ldots$


Since $|K(x)| \le 1/2$ and $1 + 1/2 + 1/4 + 1/8 + \ldots$ converges, the series (9.4)
is seen to converge uniformly (Theorem 4.3) and represents a continuous function
$f(x)$ (Theorem 4.2).

In order to see that it is nowhere differentiable, we use an elegant argumentation
of de Rham (1957). Let a point $x_0$ be given. The idea is to choose $\alpha_n = i/2^n$
and $\beta_n = (i + 1)/2^n$, where $i$ is the integer with $\alpha_n \le x_0 < \beta_n$, and to consider
the quotient


(9.5)    $r_n = \frac{f(\beta_n) - f(\alpha_n)}{\beta_n - \alpha_n}.$


Since at the values $\alpha_n$ and $\beta_n$ the sum in (9.4) is finite, $r_n$ is the slope of the
truncated series $\sum_{j=0}^{n-1} \frac{1}{2^j} K(2^j x)$ on the interval $(\alpha_n, \beta_n)$ (see Fig. 9.2 where, for
$x_0 = 1/3$, these slopes can be seen to be $0, 1, 0, 1, \ldots$).

With increasing $n$, we always have $r_{n+1} = r_n \pm 1$, and the sequence $\{r_n\}$
cannot converge.


FIGURE 9.2. The "blancmange" function


On the other hand, $r_n$ is a mean of the slopes


    $r_n = \lambda_n\, \frac{f(\beta_n) - f(x_0)}{\beta_n - x_0} + (1 - \lambda_n)\, \frac{f(x_0) - f(\alpha_n)}{x_0 - \alpha_n},$


where $\lambda_n = (\beta_n - x_0)/(\beta_n - \alpha_n) \in (0, 1]$ (if $\alpha_n = x_0$ we have $\lambda_n = 1$ and the
second term is not present). Differentiability at $x_0$ would therefore imply that


    $|r_n - f'(x_0)| < \lambda_n \varepsilon + (1 - \lambda_n)\varepsilon = \varepsilon$


for sufficiently large $n$, which is a contradiction.
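
The oscillating slopes $r_n$ can be computed exactly with rational arithmetic. The Python sketch below evaluates the blancmange function at dyadic points, where the series (9.4) is a finite sum, and lists the quotients (9.5) for $x_0 = 1/3$; the helper names are ours.

```python
from fractions import Fraction
import math

def K(x: Fraction) -> Fraction:
    """Zigzag function (9.3), extended with period 1."""
    x = x - math.floor(x)
    return x if x <= Fraction(1, 2) else 1 - x

def takagi_dyadic(x: Fraction, n: int) -> Fraction:
    """f(x) for x = i/2^n: only the terms j < n of (9.4) are nonzero there."""
    return sum((K(2**j * x) / 2**j for j in range(n)), Fraction(0))

def slope(x0: Fraction, n: int) -> Fraction:
    """Quotient (9.5) on the dyadic interval of length 2^(-n) containing x0."""
    i = math.floor(x0 * 2**n)
    alpha, beta = Fraction(i, 2**n), Fraction(i + 1, 2**n)
    return (takagi_dyadic(beta, n) - takagi_dyadic(alpha, n)) / (beta - alpha)

x0 = Fraction(1, 3)
print([int(slope(x0, n)) for n in range(8)])
```

Consecutive slopes differ by exactly one, so the sequence cannot converge.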


Weierstrass’s Approximation Theorem 
This is the fundamental proposition established by Weierstrass. 
(Borel 1905, p. 50) 


We have just digested the first Weierstrass surprise, which is the existence of
continuous functions without a derivative; now comes the second: we can make them
differentiable as often as we want, even polynomials, if only we allow an arbitrarily
small error $\varepsilon$.


(9.2) Theorem (Weierstrass 1885). Let $f : [a,b] \to \mathbb{R}$ be a continuous function.
For every $\varepsilon > 0$ there exists a polynomial $p(x)$ such that


(9.6)    $|p(x) - f(x)| < \varepsilon$   for all $x \in [a, b]$.


In other terms, $f(x) - \varepsilon < p(x) < f(x) + \varepsilon$, i.e., the polynomial $p(x)$ is bounded
between $f(x) - \varepsilon$ and $f(x) + \varepsilon$ on the entire interval $[a, b]$.


FIGURE 9.3a. Dirac sequence (9.7)        FIGURE 9.3b. Mass concentration


The list of mathematicians, compiled from Borel (1905, p. 50) and Meinardus 
(1964, p.7), who provided proofs for this theorem, shows how much they were 
fascinated by this result: Weierstrass (1885), Picard (1890, p. 259), Lerch 1892, 
Volterra 1897, Lebesgue 1898, Mittag-Leffler 1900, Landau (1908), D. Jackson 
1911, S. Bernstein 1912, P. Montel 1918, Marchand 1927, W. Gontscharov 1934. 
This theorem, which is related to approximation by trigonometric polynomials, 
has also been generalized in various ways (see Meinardus 1964, §2). The follow- 
ing proof is based on the idea of, as we say today, “Dirac sequences”. 


Dirac Sequences. We set, with Landau (1908, see Fig. 9.3a),


(9.7)    $\varphi_n(x) = \begin{cases} \mu_n (1 - x^2)^n & \text{if } -1 \le x \le 1, \\ 0 & \text{otherwise}, \end{cases}$


where the factor


(9.8)    $\mu_n = \frac{1 \cdot 3 \cdot 5 \cdot 7 \cdot \ldots \cdot (2n+1)}{2 \cdot 4 \cdot 6 \cdot \ldots \cdot 2n} \cdot \frac{1}{2}$


is chosen such that


(9.9)    $\int_{-\infty}^{+\infty} \varphi_n(x)\, dx = 1$


(see Exercise II.4.3). These functions concentrate, for increasing $n$, more and
more of their "mass" at the origin:


(9.3) Lemma. Let $\varphi_n(x)$ be given by (9.7). For every $\varepsilon > 0$ and for every $\delta > 0$
with $0 < \delta < 1$ there exists an integer $N$ such that for all $n \ge N$ (see Fig. 9.3b)


(9.10)    $1 - \varepsilon < \int_{-\delta}^{\delta} \varphi_n(x)\, dx \le 1,$


(9.11)    $\int_{-1}^{-\delta} \varphi_n(x)\, dx + \int_{\delta}^{1} \varphi_n(x)\, dx < \varepsilon.$


Proof. We start with the proof of (9.11). Since $1 - x^2 \ge 1 - x$ for $0 \le x \le 1$, we
have $\int_0^1 (1 - x^2)^n\, dx \ge \int_0^1 (1 - x)^n\, dx = 1/(n+1)$, and therefore $\mu_n \le \frac{1}{2}(n+1)$.
Hence, we have for $\delta \le |x| \le 1$

    $0 \le \varphi_n(x) \le \varphi_n(\delta) \le \tfrac{1}{2}(n + 1) \cdot (1 - \delta^2)^n.$

Now $q := 1 - \delta^2 < 1$ and $(1 - \delta^2)^n = q^n$ decreases exponentially, so that
$(n + 1) \cdot (1 - \delta^2)^n \to 0$ (see (6.26)). This implies that for $n$ sufficiently large
$0 \le \varphi_n(x) < \varepsilon/2$ for $\delta \le |x| \le 1$, and Eq. (9.11) is a consequence of Theorem
5.14. The estimate (9.10) is obtained by subtracting (9.11) from (9.9).
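
The concentration of mass described in Lemma 9.3 can be observed numerically. The Python sketch below computes $\mu_n$ from the product (9.8) and approximates integrals of $\varphi_n$ by Simpson's rule (our choice of quadrature; the names `mu`, `mass`, and `tail` are illustrative). It also compares $\mu_n$ with the crude bound $(n+1)/2$ of the proof and with the asymptotic value $\sqrt{n/\pi}$ of Exercise 9.1.

```python
import math

def mu(n: int) -> float:
    """Factor (9.8): (1*3*5*...*(2n+1)) / (2*4*6*...*(2n)) times 1/2."""
    m = 0.5
    for k in range(1, n + 1):
        m *= (2 * k + 1) / (2 * k)
    return m

def mass(n: int, lo: float, hi: float, steps: int = 2000) -> float:
    """Simpson approximation of the integral of phi_n over [lo, hi], a subinterval of [-1, 1]."""
    m = mu(n)
    g = lambda x: m * (1 - x * x) ** n
    h = (hi - lo) / steps
    s = g(lo) + g(hi)
    for j in range(1, steps):
        s += (4 if j % 2 else 2) * g(lo + j * h)
    return s * h / 3

def tail(n: int, delta: float) -> float:
    """Mass of phi_n outside [-delta, delta], i.e. the left side of (9.11)."""
    return 2 * mass(n, delta, 1.0)

for n in (1, 10, 100):
    print(n, mu(n), (n + 1) / 2, math.sqrt(n / math.pi), mass(n, -1.0, 1.0), tail(n, 0.3))
```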


A Proof of Weierstrass's Approximation Theorem. We may assume that $0 <
a < b < 1$ (the general case is reduced to this one by a transformation of the form
$x \mapsto \alpha + \beta x$ with suitably chosen constants $\alpha$ and $\beta$). We then extend $f(x)$ to
a continuous function on $[0, 1]$, e.g., by putting $f(x) = f(a)$ for $0 \le x \le a$ and
$f(x) = f(b)$ for $b \le x \le 1$. Then, we set for $\xi \in [a, b]$


(9.12)    $p_n(\xi) := \int_0^1 f(x)\, \varphi_n(x - \xi)\, dx = \mu_n \int_0^1 f(x)\, \bigl(1 - (x - \xi)^2\bigr)^n\, dx.$


If we expand the factor $(1 - (x - \xi)^2)^n$ by the binomial theorem, we obtain a
polynomial in $\xi$ of degree $2n$, whose coefficients are functions of $x$. On inserting
it into (9.12), we see that $p_n(\xi)$ is a polynomial of degree $2n$.


Motivation. For a fixed $\xi \in [a, b]$ the function $\varphi_n(x - \xi)$ will have its peak shifted
to the point $\xi$ (Fig. 9.4). Hence, the product $f(x) \cdot \varphi_n(x - \xi)$ multiplies (more or
less) the peak by the value $f(\xi)$. We therefore expect, because of (9.9), that the
integral (9.12) will be close to $f(\xi)$.


Estimation of the Error. For the error between $p_n(\xi)$ and $f(\xi)$ we shall use the
triangle inequality as follows:


(9.13)    $|p_n(\xi) - f(\xi)| \le \Bigl| \int_0^1 f(x)\, \varphi_n(x - \xi)\, dx - \int_{\xi - \delta}^{\xi + \delta} f(x)\, \varphi_n(x - \xi)\, dx \Bigr|$
          $\qquad + \Bigl| \int_{\xi - \delta}^{\xi + \delta} f(x)\, \varphi_n(x - \xi)\, dx - \int_{\xi - \delta}^{\xi + \delta} f(\xi)\, \varphi_n(x - \xi)\, dx \Bigr|$
          $\qquad + \Bigl| \int_{\xi - \delta}^{\xi + \delta} f(\xi)\, \varphi_n(x - \xi)\, dx - f(\xi) \Bigr|.$


FIGURE 9.4. Landau's proof


We fix some $\varepsilon > 0$. Since $f$ is continuous on $[0, 1]$, it is uniformly continuous
there (Theorem 4.5). Hence, there exists a $\delta > 0$ independent of $\xi$ such that


(9.14)    $|f(x) - f(\xi)| < \varepsilon$   if   $|x - \xi| < \delta$.


This $\delta$ is, if necessary, further reduced to satisfy $\delta \le a$ and $\delta \le 1 - b$. Hence, we
always have $[\xi - \delta, \xi + \delta] \subset [0,1]$. Furthermore, the function $f(x)$ is bounded,
i.e., satisfies $|f(x)| \le M$ for $x \in [0,1]$ (Theorem 3.6).

The three terms to the right of Eq. (9.13) can now be estimated as follows:
for the first one we use boundedness of $f(x)$ and Eq. (9.11) and we see that it
is bounded by $M\varepsilon$; similarly, the use of Eq. (9.10) shows that the third term is
bounded by $M\varepsilon$; finally, it follows from (9.14) and (9.9) that the second term is
bounded by $\varepsilon$. We thus have


    $|p_n(\xi) - f(\xi)| \le (2M + 1)\,\varepsilon$


for sufficiently large $n$. Since this estimate holds uniformly on $[a, b]$, the theorem
is proved.
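
The convergence asserted by the theorem can be watched in a small experiment. The Python sketch below evaluates the integral (9.12) by Simpson's rule instead of expanding the polynomial. The test function $f(x) = |x - 1/2|$, the interval $[a, b] = [1/4, 3/4]$, and all helper names are assumptions made for this illustration only.

```python
def mu(n: int) -> float:
    """Factor (9.8), normalizing the integral of (1 - x^2)^n over [-1, 1] to 1."""
    m = 0.5
    for k in range(1, n + 1):
        m *= (2 * k + 1) / (2 * k)
    return m

def landau_p(f, n: int, xi: float, steps: int = 2000) -> float:
    """Simpson approximation of p_n(xi) = mu_n * integral_0^1 f(x) (1 - (x - xi)^2)^n dx."""
    g = lambda x: f(x) * (1 - (x - xi) ** 2) ** n
    h = 1.0 / steps
    s = g(0.0) + g(1.0)
    for j in range(1, steps):
        s += (4 if j % 2 else 2) * g(j * h)
    return mu(n) * s * h / 3

def f(x: float) -> float:
    """A continuous test function on [0, 1] with a kink at x = 1/2."""
    return abs(x - 0.5)

def max_error(n: int) -> float:
    """Largest error |p_n(xi) - f(xi)| on a grid of points xi in [1/4, 3/4]."""
    grid = [0.25 + 0.05 * j for j in range(11)]
    return max(abs(landau_p(f, n, xi) - f(xi)) for xi in grid)

for n in (10, 100, 1000):
    print(n, max_error(n))
```

The error is largest at the kink and decreases slowly with $n$, in line with the rather wide peak of $\varphi_n$.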
FIGURE 9.5. Convergence of polynomials (9.12) to $f(x)$ of (9.15)


(9.4) Example. Consider the function $f : [1/8, 7/8] \to \mathbb{R}$ defined by


(9.15)    $f(x) = \begin{cases} -3.2x + 0.8 & \text{if } 1/8 \le x \le 1/4, \\ 1/64 - (x - 3/8)^2 & \text{if } 1/4 \le x \le 1/2, \\ 1/64 - (x - 5/8)^2 & \text{if } 1/2 \le x \le 3/4, \\ 7.6x - 5.7 & \text{if } 3/4 \le x \le 7/8. \end{cases}$


As in the above proof, we extend it to a continuous function on $[0, 1]$. The
polynomials $p_n(\xi)$ of Eq. (9.12) are plotted in Fig. 9.5 for $n = 10$, 100, and 1000. We
can observe uniform convergence on $[1/8, 7/8]$ but not on $[0, 1]$. This is due to the
fact that for $\xi = 0$ or $\xi = 1$ half of the peak of $\varphi_n(x - \xi)$ is cut off in (9.12). The
hypothesis $0 < a < b < 1$ in the above proof can therefore not be omitted.

The graphs in Fig. 9.5 were actually computed by numerically evaluating the 
integral in (9.12) for 400 values of € by a method similar to those described in 
Sect. II.6. It would be a waste of effort to calculate the 2000 coefficients of the 
polynomial. 


Exercises 


9.1 Show, with the help of Wallis' product, that the factors $\mu_n$ in (9.8) behave,
for $n \to \infty$, asymptotically as $\sqrt{n/\pi}$, and that the estimation in the proof of
Lemma 9.3 is a little crude.


9.2 Show that

(9.16)    $\varphi_n(x) = \sqrt{\frac{n}{\pi}}\; e^{-n x^2}, \qquad n = 1, 2, 3, \ldots$

is a Dirac sequence, i.e., satisfies (9.9), (9.10), and (9.11) (we shall see in
Sect. IV.5 that $\int_{-\infty}^{\infty} e^{-x^2}\, dx = \sqrt{\pi}$). This was actually the sequence on
which Weierstrass based his proof.


9.3 Find the constants $c_n$ such that

(9.17)    $\varphi_n(x) = \begin{cases} c_n \bigl(\cos\bigl(\frac{\pi x}{2}\bigr)\bigr)^n & -1 \le x \le 1, \\ 0 & \text{otherwise} \end{cases}$

is a Dirac sequence (see Exercise 5.6).
This sequence, with the help of trigonometric formulas like (I.4.4'), leads to
approximations on $[-\pi, \pi]$ by trigonometric polynomials.


9.4 Let

    $\varphi_n(x) = \begin{cases} n & \text{if } |x| \le 1/(2n), \\ 0 & \text{otherwise}. \end{cases}$

Show that for every continuous function $f(x)$

    $\lim_{n \to \infty} \int_a^b \varphi_n(x - \xi)\, f(x)\, dx = f(\xi)$   for all $a < \xi < b$.


9.5 Expand $(1 - (x - \xi)^2)^3$ in powers of $\xi$ and show that


"44 cos(a* + ./z) — sin(3z) MS 
I HEE nO ) da 


is a polynomial in $\xi$.


IV 


Calculus in Several Variables 


Drawing by K. Wanner 


The influence of physics in stimulating the creation of such mathematical 
entities as quaternions, Grassmann’s hypernumbers, and vectors should be 
noted. These creations became part of mathematics. 

(M. Kline 1972, p.791) 
Functions of several variables have their origin in geometry (e.g., curves
depending on parameters (Leibniz 1694a)) and in physics. A famous problem
throughout the 18th century was the calculation of the movement of a vibrating
string (d'Alembert 1748, Fig. 0.1). The position of a string $u(x,t)$ is actually a
function of $x$, the space coordinate, and of $t$, the time. An important breakthrough
for the systematic study of several variables, which occurred around the middle of
the 19th century, was the idea of denoting pairs (then n-tuples)


    $(x_1, x_2) =: x \qquad\qquad (x_1, x_2, \ldots, x_n) =: x$


by a single letter and of considering them as new mathematical objects. They were
called "extensive Grösse" by Grassmann (1844, 1862), "complexes" by Peano
(1888), and "vectors" by Hamilton (1853).


FIGURE 0.1. Movement of a vibrating string (harpsichord)


The first section, IV.1, will introduce norms in n-dimensional spaces, which 
enable us to extend the definitions and theorems on convergence and continuity 
quite easily (Section IV.2). However, differential calculus (Sections IV.3 and IV.4) 
as well as integral calculus (Section IV.5) in several variables will lead to new 
difficulties (interchange of partial derivatives, of integrations, and of integrations 
with derivatives). 
IV.1 Topology of n-Dimensional Space


It may appear remarkable that this idea, which is so simple and consists ba- 
sically in considering a multiple expression of different magnitudes (such 
as the “extensive magnitudes” in the sequel) as a new independent magni- 
tude, should in fact develop into a new science; ... 

(Grassmann 1862, Ausdehnungslehre, p.5) 


... it is very useful to consider “complex” numbers, or numbers formed 
with several units, ... (Peano 1888a, Math. Ann., vol. 32, p. 450) 


We denote pairs of real numbers by $(x_1, x_2)$, n-tuples by $(x_1, x_2, \ldots, x_n)$, and
call them vectors. The set of all pairs is


(1.1)    $\mathbb{R}^2 = \mathbb{R} \times \mathbb{R} = \{(x_1, x_2)\,;\ x_1, x_2 \in \mathbb{R}\}$

and the set of all n-tuples is denoted by

(1.2)    $\mathbb{R}^n = \mathbb{R} \times \mathbb{R} \times \ldots \times \mathbb{R} = \{(x_1, x_2, \ldots, x_n)\,;\ x_k \in \mathbb{R},\ k = 1, \ldots, n\}.$

Vectors can be added (componentwise) and multiplied by a real number. With
these operations, we call $\mathbb{R}^n$ an n-dimensional real vector space.


Distances and Norms 


The two-dimensional space $\mathbb{R}^2$ can be imagined as a plane, the components $x_1$ and
$x_2$ being the cartesian coordinates. The distance between two points $x = (x_1, x_2)$
and $y = (y_1, y_2)$ is, by Pythagoras's Theorem, given by (Fig. 1.1)


(1.3)    $d(x,y) = \sqrt{(y_1 - x_1)^2 + (y_2 - x_2)^2}.$


This distance only depends on the difference $y - x$ and is also denoted by $\|y - x\|_2$,
where $\|z\|_2 = \sqrt{z_1^2 + z_2^2}$ if $z = (z_1, z_2)$.


FIGURE 1.1. Distance in $\mathbb{R}^2$        FIGURE 1.2. Distance in $\mathbb{R}^3$


In three-dimensional space, the distance between $x = (x_1, x_2, x_3)$ and $y =
(y_1, y_2, y_3)$ is obtained by applying Pythagoras's Theorem twice (first to the
triangle DEF and then to ABC, see Fig. 1.2). In this way, we get $d(x, y) = \|y - x\|_2$,
where $\|z\|_2 = \sqrt{z_1^2 + z_2^2 + z_3^2}$.

For n-dimensional space $\mathbb{R}^n$ we define, by analogy,


(1.4)    $\|z\|_2 = \sqrt{z_1^2 + z_2^2 + \ldots + z_n^2},$


and call it the Euclidean norm of $z = (z_1, z_2, \ldots, z_n)$. The distance between
$x \in \mathbb{R}^n$ and $y \in \mathbb{R}^n$ is then given by $d(x, y) = \|y - x\|_2$.


(1.1) Theorem. The Euclidean norm (1.4) has the following properties:

(N1) $\|x\| \ge 0$ and $\|x\| = 0 \iff x = 0$,

(N2) $\|\lambda x\| = |\lambda| \cdot \|x\|$ for $\lambda \in \mathbb{R}$,

(N3) $\|x + y\| \le \|x\| + \|y\|$ (triangle inequality).

Proof. Property (N1) is trivial. Since $\lambda x = (\lambda x_1, \ldots, \lambda x_n)$, we have $\|\lambda x\|_2^2 =
(\lambda x_1)^2 + \ldots + (\lambda x_n)^2 = |\lambda|^2 \cdot \|x\|_2^2$, which proves (N2). For the proof of (N3)
we compute


    $\|x + y\|_2^2 = \sum_{k=1}^{n} (x_k + y_k)^2 = \sum_{k=1}^{n} x_k^2 + 2 \sum_{k=1}^{n} x_k y_k + \sum_{k=1}^{n} y_k^2$
    $\qquad\qquad \le \|x\|_2^2 + 2\, \|x\|_2\, \|y\|_2 + \|y\|_2^2 = \bigl(\|x\|_2 + \|y\|_2\bigr)^2.$


Remark. In the above proof, we have used the estimate


(1.5)    $\Bigl| \sum_{k=1}^{n} x_k y_k \Bigr| \le \sqrt{\sum_{k=1}^{n} x_k^2} \cdot \sqrt{\sum_{k=1}^{n} y_k^2},$


which is known as the Cauchy-Schwarz inequality. It is obtained from $\sum_{k=1}^{n} (x_k -
\gamma y_k)^2 \ge 0$ in exactly the same way as (III.5.19). With the notation


(1.6)    $\langle x, y \rangle = \sum_{k=1}^{n} x_k y_k$


for the scalar product of the two vectors $x$ and $y$, inequality (1.5) can be written
more briefly as


(1.5')    $|\langle x, y \rangle| \le \|x\|_2 \cdot \|y\|_2.$


In the sequel, we rarely need the explicit formula of Eq. (1.4). We shall usually
just use the properties (N1) through (N3).


(1.2) Definition. A mapping $\| \cdot \| : \mathbb{R}^n \to \mathbb{R}$, which satisfies (N1), (N2), and (N3),
is called a norm on $\mathbb{R}^n$. The space $\mathbb{R}^n$, together with a norm, is called a normed
vector space.

Examples (Jordan 1882, Cours d'Analyse, vol. I, p. 18, Peano 1890b, footnote on
p. 186, Fréchet 1906). Besides the Euclidean norm (1.4), we have


(1.7)    $\|x\|_1 = \sum_{k=1}^{n} |x_k|$        $\ell_1$-norm,


(1.8)    $\|x\|_{\infty} = \max_{k=1,\ldots,n} |x_k|$        maximum norm,


(1.9)    $\|x\|_p = \Bigl( \sum_{k=1}^{n} |x_k|^p \Bigr)^{1/p}, \quad p \ge 1$        $\ell_p$-norm.


The verification of properties (N1) and (N2) for all these norms and the verification
of (N3) for (1.7) and (1.8) are easy. We will see later ("Hölder's inequality",
see (4.42)) that the triangle inequality (N3) also holds for (1.9) for any $p \ge 1$.
(1.3) Theorem. For any $x \in \mathbb{R}^n$, we have


(1.10)    $\|x\|_{\infty} \le \|x\|_2 \le \|x\|_1 \le n \cdot \|x\|_{\infty}.$


Proof. We only prove the second inequality (the proof of the others is very
easy and therefore omitted). Taking the square $\|x\|_1^2$ in Eq. (1.7) and multiplying
out, we obtain the sum of squares $\sum_k x_k^2$ (which is $\|x\|_2^2$) and the mixed products
$|x_k| \cdot |x_l|$, which all are non-negative. This implies that $\|x\|_1^2 \ge \|x\|_2^2$.


Each of these norms can be minorized or majorized (up to a positive factor)
by each of the others. This shows that the norms $\|x\|_1$, $\|x\|_2$, and $\|x\|_{\infty}$ are
equivalent in the sense of the following definition.
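
The chain of inequalities (1.10) and the triangle inequality (N3) can be tested on random vectors. The following Python sketch is merely a numerical plausibility check, not a proof; the function names and the small rounding tolerance `tol` are our own choices.

```python
import math
import random

def norm1(x):
    """l_1-norm (1.7)."""
    return sum(abs(t) for t in x)

def norm2(x):
    """Euclidean norm (1.4)."""
    return math.sqrt(sum(t * t for t in x))

def norm_inf(x):
    """Maximum norm (1.8)."""
    return max(abs(t) for t in x)

def satisfies_chain(x, tol=1e-12):
    """Check inequality (1.10) for the vector x, up to rounding."""
    n = len(x)
    return (norm_inf(x) <= norm2(x) + tol
            and norm2(x) <= norm1(x) + tol
            and norm1(x) <= n * norm_inf(x) + tol)

def satisfies_triangle(x, y, tol=1e-12):
    """Check (N3) for all three norms, up to rounding."""
    s = [a + b for a, b in zip(x, y)]
    return all(nrm(s) <= nrm(x) + nrm(y) + tol for nrm in (norm1, norm2, norm_inf))

random.seed(1)  # fixed seed so that the run is reproducible
samples = [[random.uniform(-1, 1) for _ in range(5)] for _ in range(100)]
print(all(satisfies_chain(x) for x in samples))
print(all(satisfies_triangle(x, y) for x, y in zip(samples, samples[1:])))
```

For the vector $(3, -4, 0)$ the three norms are 4, 5, and 7, which illustrates that all inequalities in (1.10) can be strict.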


(1.4) Definition. Two norms || - ||, and || - ||, are called equivalent if there exist 
positive constants C', and C2 such that 


(1.11) Cy: ||ZIlp < |lellq < Co- |IzIlp forall «x €R”. 


Convergence of Vector Sequences 


Our next aim is to extend the definitions and results of Sect. III.1 to infinite
sequences of vectors. We consider $\{x_i\}_{i \ge 1}$, where each $x_i$ is itself a vector, i.e.,


(1.12)  $x_i = (x_{1i}, x_{2i}, \dots, x_{ni})$,  $i = 1, 2, 3, \dots$.

(1.5) Definition. We say that the sequence $\{x_i\}_{i \ge 1}$, given by (1.12), converges to
the vector $a = (a_1, a_2, \dots, a_n) \in \mathbb{R}^n$ if


$\forall \varepsilon > 0 \;\; \exists N \ge 1 \;\; \forall i \ge N \quad \|x_i - a\| < \varepsilon$.


As in the one-dimensional case, we then write $\lim_{i \to \infty} x_i = a$.


FIGURE 1.3. Convergent sequence in $\mathbb{R}^2$


This is exactly the same definition as in (III.1.4), except that “absolute val- 
ues” are replaced by “norms”. 


(1.6) Remark. In order to be precise, one has to specify the norm used in
Definition 1.5, e.g., the Euclidean norm. But if $\|\cdot\|_p$ is equivalent to $\|\cdot\|_q$, then we
have

(1.13)  convergence in $\|\cdot\|_p$ $\iff$ convergence in $\|\cdot\|_q$.


Indeed, $\|x_i - a\|_p < \varepsilon$ and (1.11) imply that $\|x_i - a\|_q < C_2 \varepsilon$. Since $\varepsilon > 0$
is arbitrary in Definition 1.5, we can replace it by $\varepsilon' = C_2 \varepsilon$ and we see that
convergence in $\|\cdot\|_p$ implies convergence in $\|\cdot\|_q$. The converse follows in the
same way from the first inequality of (1.11).
Theorem 1.3 shows that $\|\cdot\|_1$, $\|\cdot\|_2$, and $\|\cdot\|_\infty$ are equivalent, and later
(Theorem 2.4) we shall see that all norms in $\mathbb{R}^n$ are equivalent. Therefore, we
may take any norm in Definition 1.5 and the convergence of $\{x_i\}$ is independent
of the chosen norm.


(1.7) Theorem. For a vector sequence (1.12) we have


$\lim_{i \to \infty} x_i = a \iff \lim_{i \to \infty} x_{ki} = a_k$  for $k = 1, 2, \dots, n$,


i.e., convergence in $\mathbb{R}^n$ means componentwise convergence.

Proof. For the maximum norm (1.8) we have


(1.14)  $\|x_i - a\|_\infty < \varepsilon \iff |x_{ki} - a_k| < \varepsilon$  for $k = 1, 2, \dots, n$.


On choosing $\|\cdot\|_\infty$ in Definition 1.5, we obtain the statement.
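
The equivalence (1.14) can be illustrated on a concrete sequence. The sequence $x_i = (1/i,\ 1 + 1/i^2)$ with limit $a = (0, 1)$ is an example chosen here, not one from the text; the helper `first_index_within` searches for an index $N$ as in Definition 1.5.

```python
def x(i):
    """Illustrative vector sequence in R^2 converging to a = (0, 1)."""
    return (1.0 / i, 1.0 + 1.0 / i**2)


a = (0.0, 1.0)


def dist_inf(u, v):
    """Distance in the maximum norm (1.8)."""
    return max(abs(uk - vk) for uk, vk in zip(u, v))


def first_index_within(eps, limit=10**6):
    """Smallest i <= limit with ||x_i - a||_inf < eps, or None.

    For this particular sequence the distance decreases with i, so every
    later term satisfies the bound as well.
    """
    for i in range(1, limit + 1):
        if dist_inf(x(i), a) < eps:
            return i
    return None


N = first_index_within(0.01)
# From index N on, every component is within eps of its limit, as in (1.14).
componentwise_ok = all(
    abs(x(i)[k] - a[k]) < 0.01 for i in range(N, N + 100) for k in range(2)
)
print(N, componentwise_ok)
```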


With these preparations, it is easy to transcribe the other definitions and
results of Sect. III.1 to the higher dimensional case. For example, we call a sequence
$\{x_i\}_{i \ge 1}$ of vectors bounded, if there exists a number $B \ge 0$ such that $\|x_i\| \le B$
for all $i \ge 1$. Again, boundedness is independent of the chosen norm. As in
Theorem III.1.3, we see that convergent vector sequences are bounded.

A sequence $\{x_i\}_{i \ge 1}$ is called a Cauchy sequence if


(1.15)  $\forall \varepsilon > 0 \;\; \exists N \ge 1 \;\; \forall i \ge N \;\; \forall \ell \ge 1 \quad \|x_i - x_{i+\ell}\| < \varepsilon$.


Using the maximum norm in (1.15), this is seen to be equivalent to the fact that, for
$k = 1, \dots, n$, the real sequences $\{x_{ki}\}_{i \ge 1}$ are Cauchy sequences. Consequently,
we immediately obtain the following extension of Theorem III.1.8.


(1.8) Theorem. A sequence of vectors in $\mathbb{R}^n$ is convergent if and only if it is a
Cauchy sequence.


The generalization of the Bolzano-Weierstrass theorem is somewhat more
complicated.


(1.9) Theorem (Bolzano-Weierstrass). Every bounded sequence of vectors in $\mathbb{R}^n$
possesses a convergent subsequence.


Proof. Let $\{x_i\}_{i \ge 1}$ be our bounded sequence. We first consider the sequence
$\{x_{1i}\}_{i \ge 1}$ of first components. It is also a bounded sequence, and by Theorem
III.1.17, we can extract a convergent subsequence, say,


(1.16)  $x_{1,1},\ x_{1,5},\ x_{1,9},\ x_{1,22},\ x_{1,37},\ x_{1,58},\ x_{1,238},\ x_{1,576},\ \dots$.


We then consider the second components. The main idea, however, consists in
considering them only for the subsequence corresponding to (1.16) and not for
the whole sequence. This sequence is bounded, and we can again apply Theorem
III.1.17 to find a convergent subsequence, say,


(1.17)  $x_{2,1},\ x_{2,9},\ x_{2,58},\ x_{2,576},\ \dots$.


Now, the sequence $x_1, x_9, x_{58}, x_{576}, \dots$ converges in the first and in the second
component. For $n = 2$ the proof is complete. Otherwise, we consider the third
components corresponding to (1.17), and so on. After the $n$th extraction of a
subsequence, there are still infinitely many terms left and we have a sequence that
converges in all components.

Neighborhoods, Open and Closed Sets


By “set” we mean the entity $M$ formed by gathering together certain definite
and distinguishable objects $m$ of our intuition or of our thought. These
objects are called the “elements” of $M$.

(G. Cantor 1895, Werke, p. 282)


No one shall expel us from the paradise that Cantor has created for us. 
(Hilbert, Math. Ann., vol. 95, p. 170) 


A new mathematical era began when Dedekind (about 1871) and Cantor (about
1875) considered sets of points as new mathematical objects.
For sets $A$, $B$ in $\mathbb{R}^n$ we shall use the symbols

(1.18)  $A \subset B$ if all elements of $A$ also belong to $B$,
(1.19)  $A \cap B = \{x \in \mathbb{R}^n ;\ x \in A \text{ and } x \in B\}$,
(1.20)  $A \cup B = \{x \in \mathbb{R}^n ;\ x \in A \text{ or } x \in B\}$,
(1.21)  $A \setminus B = \{x \in \mathbb{R}^n ;\ x \in A \text{ but } x \notin B\}$,
(1.22)  $\complement A = \{x \in \mathbb{R}^n ;\ x \notin A\}$.


The role of open intervals is played by


(1.23)  $B_\varepsilon(a) = \{x \in \mathbb{R}^n ;\ \|x - a\| < \varepsilon\}$,


which we call a disc (or ball) of radius $\varepsilon$ and center $a$ (see Fig. 1.4).


FIGURE 1.4. Discs of radius $\varepsilon = 1, 1/2, 1/4$ for $\|x\|_p$, $p = 1, 1.5, 2, 3, 100$


(1.10) Definition (Hausdorff 1914, Chap. VII, §1; see also p. 456). Let $a \in \mathbb{R}^n$ be
given. A set $V \subset \mathbb{R}^n$ is called a neighborhood of $a$, if there exists an $\varepsilon > 0$ such
that $B_\varepsilon(a) \subset V$.


The discs $B_\varepsilon(a)$ depend on the norm ($\|\cdot\|_1$, $\|\cdot\|_2$, or $\|\cdot\|_\infty$, ...); the definition
of a “neighborhood”, however, is independent of the norm used, provided that the
norms are equivalent. Each $B_\varepsilon(a)$ corresponding to one norm will always contain
a $B_{\varepsilon'}(a)$ for any other norm (Fig. 1.5).


(1.11) Definition (Weierstrass, Hausdorff 1914, p. 215). A set $U \subset \mathbb{R}^n$ is open
(originally: “ein Gebiet”) if $U$ is a neighborhood of each of its points, i.e.,


$U$ open $\iff$ $\forall x \in U \;\; \exists \varepsilon > 0 \quad B_\varepsilon(x) \subset U$.


FIGURE 1.5. Neighborhoods


(1.12) Definition (G. Cantor 1884, p. 470; see Ges. Abhandlungen, p. 226). A set
$F \subset \mathbb{R}^n$ is closed if each convergent sequence $\{x_i\}_{i \ge 1}$ with $x_i \in F$ has its limit
point in $F$, i.e.,


$F$ closed $\iff$ $a = \lim_{i \to \infty} x_i$ and $x_i \in F$ imply $a \in F$.


Examples in $\mathbb{R}$. The so-called “open interval” $(a, b) = \{x \in \mathbb{R} ;\ a < x < b\}$ is an
open set. Indeed, for every $x \in (a, b)$ the number $\varepsilon = \min(x - a, b - x)$ is strictly
positive and we have $B_\varepsilon(x) \subset (a, b)$. On the other hand, the sequence $\{a + 1/i\}$
(for $i \ge 1$) is convergent, its elements lie in $(a, b)$ for sufficiently large $i$, but its
limit is not in $(a, b)$. Therefore, the set $(a, b)$ is not closed.

The set $[a, b] = \{x \in \mathbb{R} ;\ a \le x \le b\}$ is closed (see Theorem III.1.6).
However, neither $a$ nor $b$ has a neighborhood that is entirely in $[a, b]$. Hence,
$[a, b]$ is not open.

The interval $A = [a, b)$ is neither open nor closed, because $a$ has no
neighborhood lying in $[a, b)$ and the limit of the convergent sequence $\{b - 1/i\}$ is not
in $[a, b)$.

Finally, the set $\mathbb{R} = (-\infty, +\infty)$ is both open and closed, and so is the empty
set $\emptyset$.


(1.13) Lemma.
a) The set $A = \{x \in \mathbb{R}^n ;\ \|x\| < 1\}$ is open.
b) The set $A = \{x \in \mathbb{R}^n ;\ \|x\| \le 1\}$ is closed.


Proof. a) For $a \in A$ we take $\varepsilon = 1 - \|a\|$, which is positive. With this choice, we
have $B_\varepsilon(a) \subset A$ (see Fig. 1.6), since, with the use of the triangle inequality, we
have for $x \in B_\varepsilon(a)$ that


$\|x\| = \|x - a + a\| \le \|x - a\| + \|a\| < \varepsilon + \|a\| = 1$.


Hence, $A$ is open.

FIGURE 1.6. Open sets $\{x \in \mathbb{R}^2 ;\ \|x\|_p < 1\}$ for $p = 1, 1.5, 2, 3, 100$


FIGURE 1.7. Closed sets $\{x \in \mathbb{R}^2 ;\ \|x\|_p \le 1\}$


b) Consider a sequence $\{x_i\}_{i \ge 1}$ satisfying $x_i \in A$ (for all $i$) and converging
to $a$. We have to show that $a \in A$. Suppose the contrary, $a \notin A$ (i.e., $\|a\| > 1$,
see Fig. 1.7), and take $\varepsilon = \|a\| - 1$. For this $\varepsilon$ there exists an $N \ge 1$ such that
$\|x_i - a\| < \varepsilon$ for $i \ge N$. Using the triangle inequality (or better yet Exercise 1.1),
we deduce


$\|x_i\| = \|x_i - a + a\| \ge \|a\| - \|x_i - a\| > \|a\| - \varepsilon = 1$


for sufficiently large $i$. This contradicts the fact that $x_i \in A$ for all $i$. Hence,
$A = \{x \in \mathbb{R}^n ;\ \|x\| \le 1\}$ is closed.
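
The choice $\varepsilon = 1 - \|a\|$ in part a) of the proof can be tried out numerically. The sketch below assumes the Euclidean norm in $\mathbb{R}^2$ and a sample center $a = (0.6, -0.3)$; both are illustrative choices, and the helper names are hypothetical. It samples points of $B_\varepsilon(a)$ and checks that they all lie in the open unit disc.

```python
import math
import random


def norm2(x):
    """Euclidean norm."""
    return math.sqrt(sum(v * v for v in x))


def radius_inside_unit_ball(a):
    """The radius eps = 1 - ||a|| used in the proof of Lemma 1.13 a)."""
    na = norm2(a)
    if na >= 1.0:
        raise ValueError("a must satisfy ||a|| < 1")
    return 1.0 - na


def random_point_in_disc(a, eps, rng):
    """A point x in R^2 with ||x - a||_2 < eps (rng.random() lies in [0, 1))."""
    r = eps * rng.random()
    phi = 2.0 * math.pi * rng.random()
    return (a[0] + r * math.cos(phi), a[1] + r * math.sin(phi))


rng = random.Random(1)
a = (0.6, -0.3)
eps = radius_inside_unit_ball(a)
points = [random_point_in_disc(a, eps, rng) for _ in range(10000)]
all_inside = all(norm2(x) < 1.0 for x in points)
print(eps, all_inside)
```

For a center with $\|a\| = 1$ the function raises an error: no positive radius exists, which is exactly why the closed disc of part b) is not open.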


Further Examples. The set $A = \{x \in \mathbb{R}^2 ;\ x_1, x_2 \in \mathbb{Q},\ \|x\| < 1\}$ is neither
open nor closed. Indeed, each disc contains irrational points and a limit of rational
points can be irrational.


FIGURE 1.8. Cantor set


The famous Cantor set (1883, see Werke, p. 207, Example 11; Fig. 1.8) is
given by


(1.24)  $A = [0, 1] \setminus \{(1/3, 2/3) \cup (1/9, 2/9) \cup (7/9, 8/9) \cup \dots\}$
            $= \bigl\{x = \sum_{i=1}^\infty a_i 3^{-i} ;\ a_i \in \{0, 2\}\bigr\}$.


It is not open (e.g., $x = 1/3$ has no neighborhood in $A$), but is closed (see Remark
1.16 below).

“Sierpiński’s triangle” (Fig. 1.9) and Sierpiński’s carpet (Fig. 1.10) (Sierpiński
1915, 1916) are two-dimensional generalizations of Cantor’s set. The drawings in
Figs. 1.9 and 1.10 are not only charming because of their aesthetic appeal, but
remind us as well that sets can be rather complicated objects.


FIGURE 1.9. Sierpiński’s triangle        FIGURE 1.10. Sierpiński’s carpet
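
The two descriptions of the Cantor set given above (removal of open middle thirds, and ternary expansions with digits 0 and 2 only) can both be sketched in a few lines. Exact rational arithmetic is used so that boundary points such as $1/3$ are treated correctly; the function names are illustrative assumptions.

```python
from fractions import Fraction


def cantor_intervals(level):
    """Closed intervals left after `level` steps of removing open middle thirds."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(level):
        refined = []
        for lo, hi in intervals:
            third = (hi - lo) / 3
            refined.append((lo, lo + third))
            refined.append((hi - third, hi))
        intervals = refined
    return intervals


def in_cantor_set(x):
    """Decide membership of a rational x by generating ternary digits.

    x <= 1/3 corresponds to digit 0 and x >= 2/3 to digit 2; a point of the open
    middle third is rejected. The orbit of a rational number is finite, so the
    loop stops as soon as a value repeats.
    """
    x = Fraction(x)
    if not 0 <= x <= 1:
        return False
    seen = set()
    while x not in seen:
        seen.add(x)
        if x <= Fraction(1, 3):
            x = 3 * x
        elif x >= Fraction(2, 3):
            x = 3 * x - 2
        else:
            return False
    return True


print(cantor_intervals(2))
print(in_cantor_set(Fraction(1, 4)), in_cantor_set(Fraction(1, 2)))
```

The point $1/4$ is accepted although it is not an endpoint of any removed interval; its ternary expansion is $0.020202\dots$.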


(1.14) Theorem. We have


i) $F$ closed $\implies$ $\complement F$ open,
ii) $U$ open $\implies$ $\complement U$ closed.


Proof. i) Suppose that $\complement F$ is not open. Then there exists an $a \in \complement F$ (i.e., $a \notin F$)
such that for all $\varepsilon > 0$ we have $B_\varepsilon(a) \not\subset \complement F$. Taking $\varepsilon = 1/i$, we can choose a
sequence $\{x_i\}_{i \ge 1}$ satisfying $x_i \in F$ and $\|x_i - a\| < 1/i$. Since $F$ is closed, we
have $a \in F$, a contradiction.

ii) Suppose that $\complement U$ is not closed. This means that there exists a sequence
$x_i \in \complement U$ (i.e., $x_i \notin U$) converging to an $a \notin \complement U$ (i.e., $a \in U$). Since $U$ is open,
we have $B_\varepsilon(a) \subset U$ for an $\varepsilon > 0$. Thus, $x_i \notin B_\varepsilon(a)$ for all $i$, a contradiction.


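(1.15) Theorem (Hausdorff 1914, p. 216). For a finite number of sets, we have

i) $U_1, U_2, \dots, U_m$ open $\implies$ $U_1 \cap U_2 \cap \dots \cap U_m$ is open,

ii) $F_1, F_2, \dots, F_m$ closed $\implies$ $F_1 \cup F_2 \cup \dots \cup F_m$ is closed.

For an arbitrary family of sets (with index set $\Lambda$), we have

iii) $U_\lambda$ open for all $\lambda$ $\implies$ $\bigcup_{\lambda \in \Lambda} U_\lambda = \{x \in \mathbb{R}^n ;\ \exists \lambda \in \Lambda,\ x \in U_\lambda\}$ is open,
iv) $F_\lambda$ closed for all $\lambda$ $\implies$ $\bigcap_{\lambda \in \Lambda} F_\lambda = \{x \in \mathbb{R}^n ;\ \forall \lambda \in \Lambda,\ x \in F_\lambda\}$ is
closed.

FIGURE 1.11. Open sets with closed intersection        FIGURE 1.12. Closed sets with open union


Proof. We begin with the proof of (i). Let $x \in U_1 \cap \dots \cap U_m$ so that $x \in U_k$ for all
$k = 1, \dots, m$. Since $U_k$ is open, there exists an $\varepsilon_k > 0$ such that $B_{\varepsilon_k}(x) \subset U_k$.
With $\varepsilon = \min(\varepsilon_1, \dots, \varepsilon_m)$, we have found a positive $\varepsilon$ such that
$B_\varepsilon(x) \subset U_1 \cap \dots \cap U_m$.


The proof of (iii) is even easier and hence omitted. The equivalences (i) $\iff$
(ii) and (iii) $\iff$ (iv) are obtained from the “de Morgan rules”

(1.25)  $\complement(U_1 \cap U_2) = (\complement U_1) \cup (\complement U_2)$,
        $\complement(U_1 \cup U_2) = (\complement U_1) \cap (\complement U_2)$,


together with Theorem 1.14.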


(1.16) Remark. With this theorem, we see that the Cantor set of Eq. (1.24) is
closed. Indeed, its complement


$\complement A = (-\infty, 0) \cup (1, \infty) \cup (1/3, 2/3) \cup (1/9, 2/9) \cup (7/9, 8/9) \cup \dots$

is an infinite union of open intervals and thus open by Theorem 1.15.


(1.17) Remark. The statements (i) and (ii) of Theorem 1.15 are not true in general
for an infinite number of sets.
Consider, for example, the family of open sets


(1.26)  $U_i = \{x \in \mathbb{R}^2 ;\ \|x\| < 1 + 1/i\}$,


whose intersection $U_1 \cap U_2 \cap U_3 \cap \dots = \{x \in \mathbb{R}^2 ;\ \|x\| \le 1\}$ is not open
(Fig. 1.11).

Similarly, the family of closed sets (Fig. 1.12)

(1.27)  $F_i = \{x \in \mathbb{R}^2 ;\ \|x\| \le 1 - 1/i\}$


has a union $F_1 \cup F_2 \cup F_3 \cup \dots = \{x \in \mathbb{R}^2 ;\ \|x\| < 1\}$, which is not closed.


Compact Sets 


We have already pointed out and will recognize throughout this book the 
importance of compact sets. All those concerned with general analysis have 
seen that it is impossible to do without them. 

(Fréchet 1928, Espaces abstraits, p. 66) 


(1.18) Definition (Fréchet 1906). A set $K \subset \mathbb{R}^n$ is compact if for each sequence
$\{x_i\}_{i \ge 1}$ with elements in $K$ there exists a subsequence that converges to some
element $a \in K$.


(1.19) Theorem. For $K \subset \mathbb{R}^n$ we have


$K$ compact $\iff$ $K$ bounded and closed.


Proof. Let $K$ be bounded (i.e., $\|x\| \le B$ for all $x \in K$) and closed. We then take
a sequence $\{x_i\}_{i \ge 1}$ with elements in $K$. This sequence is bounded and has, by
Theorem 1.9, a convergent subsequence. The limit of this subsequence lies in $K$,
because $K$ is closed. Hence, $K$ is compact.

On the other hand, let $K$ be a compact set. This implies that $K$ is closed,
because every subsequence of a convergent sequence converges to the same limit.
In order to see that $K$ is bounded, we assume the contrary, i.e., the existence of
a sequence $\{x_i\}$ satisfying $x_i \in K$ for all $i$ and $\|x_i\| \to \infty$. Obviously, it is
impossible to extract a convergent subsequence, so that $K$ cannot be compact in
this case.


(1.20) Remark. Compact sets are, by Definition 1.18, precisely the sets in which 
the Bolzano-Weierstrass theorem can be applied. Since this theorem is the basis 
for all deep results on uniform convergence, uniform continuity, maximum and 
minimum, Fréchet was not exaggerating (see quotation). 


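(1.21) Theorem (Heine 1872, Borel 1895). Let $K$ be compact and let $\{U_\lambda\}_{\lambda \in \Lambda}$
be a family of open sets $U_\lambda$ with


(1.28)  $\bigcup_{\lambda \in \Lambda} U_\lambda \supset K$   (open covering).

Then, there exists a finite number of indices $\lambda_1, \lambda_2, \dots, \lambda_m$ such that


$U_{\lambda_1} \cup U_{\lambda_2} \cup \dots \cup U_{\lambda_m} \supset K$.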


284 IV. Calculus in Several Variables 


Counterexamples. Before proceeding to the proof of this theorem, we show that
none of the assumptions may be omitted.
In the example


$K = \{x ;\ \|x\| < 1\}$,  $U_i = \{x ;\ \|x\| < 1 - 1/i\}$,  $i = 1, 2, 3, \dots$


it is not possible to find a finite covering of $K$. This is due to the fact that $K$ is not
closed.
In the situation


$K = \mathbb{R}^n$,  $U_i = \{x ;\ \|x\| < i\}$,  $i = 1, 2, 3, \dots$


the set $K$ is not bounded. Again, it is not possible to find a finite covering of $K$.
Hence, the boundedness of $K$ is essential.

In our last example, we consider the compact set $K = \{x ;\ \|x\| \le 1\}$, but
we consider nonopen sets $U_i$ given by


: veee. 
Ql > on =H 


U; = { (rcosy,rsiny);0<r<1, 


None of the U; is superfluous in the covering {U;};>1 (Fig. 1.13). 


FIGURE 1.13. Non open covering of $K$        FIGURE 1.14. Heine’s proof


Proof. Following Heine (1872), we enclose the compact set $K$ in an $n$-dimensional
cube $I$ (a square for $n = 2$; see Fig. 1.14). Suppose that we need an infinite number
of $U_\lambda$ to cover $K$. The idea is to split $I$ into $2^n$ small cubes by halving its sides
(here, $I_1$, $I_2$, $I_3$, $I_4$). One of the sets $K \cap I_j$ ($j = 1, \dots, 2^n$) requires an infinite
number of $U_\lambda$ in order to be covered. We fix such an index $j$ and denote the set
$K \cap I_j$ by $K_1$. Again we split this cube $I_j$ into $2^n$ small cubes, and so on. We thus
obtain a sequence of sets


$K \supset K_1 \supset K_2 \supset K_3 \supset \dots$,


each of which requires an infinite number of $U_\lambda$ in order to be covered.


IV.1 Topology of n-Dimensional Space 285 


In each $K_i$, we choose an $x_i \in K_i$. The sequence $\{x_i\}$ is a Cauchy sequence,
because the diameter of the $K_i$ tends to zero. Therefore (Theorem 1.8), it
converges and we denote its limit by $a$. Since $K$ is compact (hence closed), we have
$a \in K$. By (1.28), there exists a $\lambda$ with $a \in U_\lambda$. Since this $U_\lambda$ is open, there exists
an $\varepsilon > 0$ with $B_\varepsilon(a) \subset U_\lambda$. Using again the fact that the diameter of the $K_i$ tends
to zero, we conclude that for sufficiently large $m$ we have $K_m \subset B_\varepsilon(a) \subset U_\lambda$.
Hence, $K_m$ is covered by one single $U_\lambda$. This contradicts the assumption that
$K_m$ cannot be covered by a finite number of $U_\lambda$.


Exercises


1.1 Let $\|\cdot\|$ be a norm on $\mathbb{R}^n$. Prove that

    $\bigl|\, \|x\| - \|y\| \,\bigr| \le \|x - y\|$.

    Hint. Apply the triangle inequality to $\|x\| = \|x - y + y\|$.

1.2 Show that

    $\|x\|_2 \le \|x\|_1 \le \sqrt{n} \cdot \|x\|_2$  $\forall x \in \mathbb{R}^n$.

    Show that these estimates are “optimal”, i.e., if

    $c \cdot \|x\|_2 \le \|x\|_1 \le C \cdot \|x\|_2$  $\forall x \in \mathbb{R}^n$,

    then $c \le 1$ and $C \ge \sqrt{n}$.

1.3 Mr. C.L. Ever might have the idea of defining the “norm”

    $\|x\|_{1/2} = \Bigl(\sum_{i=1}^n |x_i|^{1/2}\Bigr)^2$.

    Show that this “norm” does not satisfy the triangle inequality. Study also the
    set $B = \{x \in \mathbb{R}^2 ;\ \|x\|_{1/2} \le 1\}$ and show that it is not convex.

1.4 For each set $A$ in $\mathbb{R}^n$ define the interior $A^\circ$ of $A$ by

    $A^\circ = \{x \mid A \text{ is neighborhood of } x\}$

    and the closure $\overline{A}$ of $A$ by

    $\overline{A} = \{x \mid A \text{ meets every neighborhood of } x\}$.

    Show that $\overline{A}$ is a closed set (in fact the smallest closed set containing $A$) and
    that $A^\circ$ is an open set (the largest open set contained in $A$).

1.5 Show that for two sets $A$ and $B$ in $\mathbb{R}^n$

    $\overline{A \cup B} = \overline{A} \cup \overline{B}$.

    Find two sets $A$ and $B$ in $\mathbb{R}$ for which

    $\overline{A \cap B} \ne \overline{A} \cap \overline{B}$,   $(A \cup B)^\circ \ne A^\circ \cup B^\circ$.

1.6 (Sierpiński’s Triangle 1915). Let $a$, $b$, $c$ be three points in $\mathbb{R}^2$ forming an
    equilateral triangle. Consider the set

    $T = \bigl\{\lambda a + \mu b + \nu c ;\ \lambda = \sum_{i=1}^\infty \lambda_i 2^{-i},\ \mu = \sum_{i=1}^\infty \mu_i 2^{-i},\ \nu = \sum_{i=1}^\infty \nu_i 2^{-i}\bigr\}$,

    where $\lambda_i, \mu_i, \nu_i$ are 0 or 1 such that $\lambda_i + \mu_i + \nu_i = 1$ for all $i$. Determine the
    shape of $T$. Is it open? Closed? Compact?
1.7. Show that 
1 2 
lll = (lel + Jeol) + = max{|a|, [eal} 


    is a norm on $\mathbb{R}^2$. Determine for this norm the shape of the “unit disc”
    $B_1(0) = \{x \in \mathbb{R}^2 ;\ \|x\| < 1\}$.

1.8 Show that the map $N : \mathbb{R}^2 \to \mathbb{R}$ defined by

    $N(x_1, x_2) = \sqrt{a x_1^2 + 2 b x_1 x_2 + c x_2^2}$

    is a norm on $\mathbb{R}^2$ if and only if $a > 0$ and $ac - b^2 > 0$.

1.9 Deduce the Bolzano-Weierstrass theorem from the Heine-Borel theorem.
    Hint. Suppose that $\{x_n\}$ is a sequence with $\|x_n\| \le M$, with no accumulation
    point. Then, for each $a$ with $\|a\| \le M$ there is an $\varepsilon > 0$ such that $B_\varepsilon(a)$
    contains only a finite number of terms of the sequence $\{x_n\}$.

1.10 Prove that $\mathbb{R}^n$ and $\emptyset$ are the only subsets of $\mathbb{R}^n$ that are open and closed.

... according to the judgment of all mathematicians, the difficulty that readers
of this work experience is caused by the more philosophical than mathematical
form of the text ... Now, to remove this difficulty was an essential
task for me, if I wanted the book to be read and understood not only by
myself, but also by others.

(Grassmann 1862, “Professor am Gymnasium zu Stettin’’) 


Let $A$ be a subset of $\mathbb{R}^n$. A function

(2.1)  $f : A \to \mathbb{R}^m$

maps the vector $x = (x_1, \dots, x_n) \in A$ to the vector $y = (y_1, \dots, y_m) \in \mathbb{R}^m$.
Each component of $y$ is a function of $n$ independent variables. We thus write

(2.2)  $y = f(x)$  or  $y_1 = f_1(x_1, \dots, x_n),\ \dots,\ y_m = f_m(x_1, \dots, x_n)$.


FIGURE 2.1a. The function $y = x_1^2 + x_2^2$        FIGURE 2.1b. $y_1 = \cos 10x$, $y_2 = \sin 10x$


Examples. a) One function ($m = 1$) of two variables ($n = 2$) can be interpreted
as a surface in $\mathbb{R}^3$. For example, the function $y = x_1^2 + x_2^2$ represents a paraboloid
(Fig. 2.1a).

b) Two functions ($m = 2$) of one variable ($n = 1$) represent a curve in $\mathbb{R}^3$.
For example, the spiral of Fig. 2.1b is given by $y_1 = \cos 10x$, $y_2 = \sin 10x$. If we
project the curve onto the $(y_1, y_2)$-plane, we obtain a “parametric representation”
of a curve in $\mathbb{R}^2$ (in our example a circle).


(2.1) Definition. A function $f : A \to \mathbb{R}^m$, $A \subset \mathbb{R}^n$ is continuous at $x_0 \in A$ if

$\forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall x \in A : \ \|x - x_0\| < \delta \implies \|f(x) - f(x_0)\| < \varepsilon$.

This corresponds exactly to Definition III.3.2 with absolute values replaced
by norms. Our definition does not depend on the particular norms chosen, as long
as they are equivalent (by the same argument as in Remark 1.6). If we use the
maximum norm in $\mathbb{R}^m$, we find, in analogy to Theorem 1.7, the following result.


(2.2) Theorem. A function $f : A \to \mathbb{R}^m$, $A \subset \mathbb{R}^n$ given by (2.2) is continuous
at $x_0 \in A$ if and only if the function $f_j : A \to \mathbb{R}$ is continuous at $x_0$ for all
$j = 1, \dots, m$.


As a consequence of this theorem, only the case $m = 1$ has to be considered
for the study of continuity. A constant function $f(x) = c$ is obviously everywhere
continuous. The projection of $x = (x_1, \dots, x_n)$ to the $k$th coordinate, i.e., $p(x) =
x_k$, is also continuous at every point $x_0 = (x_{10}, \dots, x_{n0})$, since $|x_k - x_{k0}| \le
\|x - x_0\|$ (choose $\delta = \varepsilon$ in Definition 2.1).

It is almost trivial to generalize the Definition III.3.10 of the limit of a func- 
tion and the statements of Theorems III.3.3 and III.3.4 to the case of several 
variables as long as the product and the quotient make sense (just replace ab- 
solute values by norms). Consequently, polynomials of several variables, e.g., 
f (x1, 22,73) = x}x3—a1x323+42x3—1, are continuous everywhere, and rational 
functions are continuous at points where the denominator does not vanish. 


FIGURE 2.2. Stereogram for discontinuous function $f(x_1, x_2)$ of Eq. (2.3) (hold the picture
close to the eyes (20 cm) and stare through the paper to an object 20 cm behind it. Then the
two images will merge and become 3D)


Example. Consider the function $f : \mathbb{R}^2 \to \mathbb{R}$, given by


(2.3)  $y = f(x_1, x_2) = \begin{cases} \dfrac{x_1 x_2}{x_1^2 + x_2^2} & \text{if } x_1^2 + x_2^2 > 0 \\ 0 & \text{if } x_1 = x_2 = 0 \end{cases}$


(see Fig. 2.2). It is continuous at points satisfying $x_1^2 + x_2^2 > 0$. In order to explain
its behavior close to the origin, we use polar coordinates $x_1 = r \cos\varphi$, $x_2 =
r \sin\varphi$ so that (for $r > 0$)


$y = \dfrac{r^2 \cos\varphi \sin\varphi}{r^2} = \dfrac{1}{2} \sin 2\varphi$.

Hence, the function is constant on lines going through the origin, with the
constant depending on the angle $\varphi$. In each neighborhood of $(0, 0)$, the function (2.3)
assumes all values between $+1/2$ and $-1/2$. Therefore, it cannot be continuous
at $(0, 0)$.
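
The behavior of (2.3) along rays through the origin is easy to observe numerically. The sketch below (the helper name `values_on_ray` and the chosen radii are illustrative assumptions) evaluates $f$ at shrinking radii for a fixed angle; the values stay at $\frac12 \sin 2\varphi$ and do not approach $f(0, 0) = 0$ unless $\sin 2\varphi = 0$.

```python
import math


def f(x1, x2):
    """The function of Eq. (2.3)."""
    if x1 == 0.0 and x2 == 0.0:
        return 0.0
    return x1 * x2 / (x1 * x1 + x2 * x2)


def values_on_ray(phi, radii=(1.0, 1e-3, 1e-6, 1e-9)):
    """Values of f at the points (r cos phi, r sin phi) for shrinking radii r."""
    return [f(r * math.cos(phi), r * math.sin(phi)) for r in radii]


print(values_on_ray(math.pi / 4))    # all close to +1/2
print(values_on_ray(-math.pi / 4))   # all close to -1/2
print(f(1e-9, 0.0), f(0.0, 1e-9))    # along the coordinate axes f vanishes
```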

The interest of this example is that the partial functions $x_1 \mapsto f(x_1, 0)$ and
$x_2 \mapsto f(0, x_2)$ are continuous also at the origin. Therefore, there is no analog
of Theorem 2.2 for the independent variables $x_i$, as Cauchy (1821, p. 37) actually
thought. He was corrected, with the above counterexample, by Peano (1884,
“Annotazione N. 99”).


Continuous Functions and Compactness 


We continue extending the results of Sect. III.3 to functions of several variables. 
Many of these extensions are straightforward. For example, the analog of Theo- 
rem III.3.6 is as follows: 


(2.3) Theorem. Let $K \subset \mathbb{R}^n$ be a compact set and let $f : K \to \mathbb{R}$ be continuous
on $K$. Then, $f$ is bounded on $K$ and admits a maximum and a minimum, i.e., there
exist $u \in K$ and $U \in K$ such that


$f(u) \le f(x) \le f(U)$  for all $x \in K$.


This theorem leads to the following result, which we already announced in 
Remark 1.6. 


(2.4) Theorem. All norms in $\mathbb{R}^n$ are equivalent. This means that if $N : \mathbb{R}^n \to \mathbb{R}$
is a mapping satisfying the conditions (N1) through (N3) of Theorem 1.1, i.e.,


(N1)  $N(x) \ge 0$ and $N(x) = 0 \iff x = 0$,

(N2)  $N(\lambda x) = |\lambda| \, N(x)$ for $\lambda \in \mathbb{R}$,

(N3)  $N(x + y) \le N(x) + N(y)$  (triangle inequality),

then there exist numbers $C_1 > 0$ and $C_2 > 0$ such that


(2.4)  $C_1 \|x\|_2 \le N(x) \le C_2 \|x\|_2$  for all $x \in \mathbb{R}^n$.


Proof. We first show that $N(x)$ is continuous. We write $x = x_1 e_1 + x_2 e_2 + \dots +
x_n e_n$, where $e_1 = (1, 0, \dots, 0)$, $e_2 = (0, 1, 0, \dots, 0)$, and so on. It then follows
from (N3), (N2), and the Cauchy-Schwarz inequality (1.5) that


(2.5)  $N(x) = N(x_1 e_1 + \dots + x_n e_n) \le N(x_1 e_1) + \dots + N(x_n e_n)$
             $= |x_1| \cdot N(e_1) + \dots + |x_n| \cdot N(e_n) \le \|x\|_2 \cdot C_2$,


with $C_2 = \sqrt{N(e_1)^2 + \dots + N(e_n)^2}$. This proves the second inequality of (2.4).
We now see the continuity of $N(x)$ as follows:

$N(x) - N(x_0) = N(x - x_0 + x_0) - N(x_0) \le N(x - x_0) + N(x_0) - N(x_0) \le C_2 \|x - x_0\|_2$,

and similarly $N(x_0) - N(x) = \dots \le C_2 \|x_0 - x\|_2$, so that

(2.6)  $|N(x) - N(x_0)| \le C_2 \|x - x_0\|_2$.


We then consider the function $N(x)$ on the compact set

$K = \{x \in \mathbb{R}^n ;\ \|x\|_2 = 1\}$.

By Theorem 2.3, it admits a minimum at some $u \in K$, i.e.,

(2.7)  $N(x) \ge N(u)$  for all $x \in K$.


Putting $C_1 = N(u)$, which is positive by (N1), we have for an arbitrary $x \in \mathbb{R}^n$
($x \ne 0$) that $x / \|x\|_2 \in K$, and hence also


$C_1 \le N\Bigl(\dfrac{x}{\|x\|_2}\Bigr) = \dfrac{N(x)}{\|x\|_2}$.


This proves the first inequality of (2.4).
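
The two constants of this proof can be computed for a concrete norm. The sketch below takes $N(x) = |x_1| + 2|x_2|$ on $\mathbb{R}^2$ as an illustrative assumption. $C_2$ is obtained from the formula in the proof, while $C_1$ is only approximated by sampling the unit circle: in general a sampled minimum is an upper estimate of the true minimum, although here the minimum $N(1, 0) = 1$ is attained at a sample point.

```python
import math
import random


def N(x):
    """Illustrative norm on R^2 (an assumption, not from the text)."""
    return abs(x[0]) + 2.0 * abs(x[1])


def norm2(x):
    """Euclidean norm."""
    return math.sqrt(sum(v * v for v in x))


# C_2 = sqrt(N(e_1)^2 + N(e_2)^2), as in (2.5).
basis = [(1.0, 0.0), (0.0, 1.0)]
C2 = math.sqrt(sum(N(e) ** 2 for e in basis))

# C_1 = min of N on the unit circle, approximated on m equally spaced points.
m = 3600
C1 = min(
    N((math.cos(2 * math.pi * j / m), math.sin(2 * math.pi * j / m)))
    for j in range(m)
)

random.seed(0)
tests = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(1000)]
bounds_hold = all(
    C1 * norm2(x) <= N(x) + 1e-9 and N(x) <= C2 * norm2(x) + 1e-9 for x in tests
)
print(C1, C2, bounds_hold)
```

The constant $C_2 = \sqrt{5}$ delivered by the proof happens to be sharp for this norm (take $x$ proportional to $(1, 2)$), but in general the formula only gives an upper bound for the best constant.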


Uniform Continuity and Uniform Convergence 


Exactly as in Sect. III.4, we call a function $f : A \to \mathbb{R}^m$, $A \subset \mathbb{R}^n$ uniformly
continuous if it is continuous on $A$ and if the $\delta$ in Definition 2.1 can be chosen
independently of $x_0 \in A$. We have the following extension of Theorem III.4.5.


(2.5) Theorem (Heine 1872). Let $f : K \to \mathbb{R}^m$ be continuous on $K$ and let
$K \subset \mathbb{R}^n$ be a compact set. Then, $f$ is uniformly continuous on $K$.


Proof. The two proofs of Theorem III.4.5 can easily be adapted to the case of
several variables. Let us give, for our pleasure and as an exercise, a third proof
using Theorem 1.21 of Heine-Borel.
We know by hypothesis that

(2.8)  $\forall x_0 \in K \;\; \forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall x \in K : \ \|x - x_0\| < \delta \implies \|f(x) - f(x_0)\| < \varepsilon$.


The idea is to consider the discs $\{B_\delta(x_0)\}_{x_0 \in K}$ as an open covering of $K$ and to
extract a finite covering from it. But we will quickly realize that this will not work
very well. Let’s be more careful.

We fix an $\varepsilon > 0$. Then, we define for every $a \in K$ an open set


$U_a = \{x ;\ \|x - a\| < \delta/2 \text{ with } \delta \text{ depending on } x_0 = a \text{ defined in (2.8)}\}$.


They form an open covering of $K$. Since $K$ is compact, already a finite number
$U_{a_1}, \dots, U_{a_N}$ cover the set $K$. With the corresponding numbers $\delta_1, \dots, \delta_N$, we
define

$\delta = \min\{\delta_1/2, \delta_2/2, \dots, \delta_N/2\}$.

Now let $x \in K$ and $y \in K$ be arbitrary points satisfying $\|x - y\| < \delta$. We
will show that $\|f(x) - f(y)\| < 2\varepsilon$. Since $x \in K$, there exists an index $i$ with
$x \in U_{a_i}$, i.e., $\|x - a_i\| < \delta_i/2$. It then follows from $\|x - y\| < \delta \le \delta_i/2$ and the
triangle inequality that $\|y - a_i\| < \delta_i$. From (2.8), we thus have


$\|f(x) - f(y)\| \le \|f(x) - f(a_i)\| + \|f(a_i) - f(y)\| < \varepsilon + \varepsilon = 2\varepsilon$,


which proves the statement.


All definitions and results of Sect. III.4 concerning uniform convergence of a
sequence of functions carry over immediately to the case of several dimensions.
Therefore, if a sequence of continuous functions $f_k : A \to \mathbb{R}^m$, $A \subset \mathbb{R}^n$
converges uniformly on $A$ to a function $f(x)$, this limit function is continuous (a
straightforward extension of Theorem III.4.2). Here is an interesting example.


FIGURE 2.3. Curve of Peano-Hilbert


Curve of Peano-Hilbert. 
A continuous curve can fill a portion of space: this is one of the most re- 
markable facts of set theory, whose discovery we owe to G. Peano. 
(Hausdorff 1914, p. 369) 
Cantor (1878) discovered the sensational result that there is a one-to-one
correspondence between the points of an interval and those of a square. But Cantor’s
mapping was not continuous. Peano (1890) then found, by a skillful manipulation
of the coordinates in base 3, a continuous curve filling a whole square. Soon
thereafter, Hilbert (1891) discovered such curves by a beautiful “geometrische
Anschauung”: he repeatedly divided the squares into four subsquares and labeled
their centers consecutively by following the direction of the previous curve (see
Fig. 2.3).


FIGURE 2.4. Creation of Hilbert’s curve


Another Construction. Let $\varphi(t) = (x(t), y(t))$, $0 \le t \le 1$ be an arbitrary
continuous curve connecting the points $A = (0, 0)$ for $t = 0$ and $B = (1, 0)$ for $t = 1$
(see Fig. 2.4). We then define a new curve $\Phi\varphi$ by


$(\Phi\varphi)(t) = \begin{cases}
\frac12 \bigl(y(4t),\ x(4t)\bigr) & \text{if } 0 \le t \le \frac14 \\
\frac12 \bigl(x(4t-1),\ 1 + y(4t-1)\bigr) & \text{if } \frac14 \le t \le \frac24 \\
\frac12 \bigl(1 + x(4t-2),\ 1 + y(4t-2)\bigr) & \text{if } \frac24 \le t \le \frac34 \\
\frac12 \bigl(2 - y(4t-3),\ 1 - x(4t-3)\bigr) & \text{if } \frac34 \le t \le 1
\end{cases}$


This again gives a continuous curve connecting $A = (0, 0)$ for $t = 0$ and $B =
(1, 0)$ for $t = 1$ (see second picture of Fig. 2.4) so that the procedure can be
repeated (third picture of Fig. 2.4). This leads to a sequence of functions $\varphi_0 = \varphi$,
$\varphi_1 = \Phi\varphi_0$, $\varphi_2 = \Phi\varphi_1$, and so on. Whenever we start from another initial curve
$\psi(t)$ with $\|\varphi(t) - \psi(t)\|_\infty \le K$ for $t \in [0, 1]$, then $\|\Phi\varphi(t) - \Phi\psi(t)\|_\infty \le K/2$
(see Fig. 2.4). It follows that


(2.9)  $\|\varphi_k(t) - \psi_k(t)\|_\infty \le K \cdot 2^{-k}$,

and, by putting $\psi(t) = \varphi_m(t)$ and $K = 1$,

(2.10)  $\|\varphi_k(t) - \varphi_{k+m}(t)\|_\infty \le 2^{-k}$.


We see from (2.10) that the sequence $\varphi_k(t)$ converges uniformly (Cauchy’s
criterion (III.4.4)), and thus has a continuous limit $\varphi_\infty(t)$ (Theorem III.4.2). Further,
from (2.9) we see that the limiting function is independent of the initial function
$\varphi(t)$. Hilbert’s curve from Fig. 2.3, when compared with the curves of Fig. 2.4,
has slight modifications toward the end points of the intervals $[i/4^k, (i + 1)/4^k]$,
which disappear as $k \to \infty$.
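
The operator $\Phi$ and the estimate (2.10) can be sketched directly in code. The initial curve $\varphi_0(t) = (t, 0)$, the names `Phi`, `iterate`, and `sup_dist`, and the dyadic sampling grid are assumptions made for this illustration; the distance is measured only on the grid, so it is a lower estimate of the true supremum.

```python
def Phi(curve):
    """One step of the construction: maps a curve on [0, 1] to the curve Phi(curve)."""
    def new_curve(t):
        if t <= 0.25:
            x, y = curve(4 * t)
            return (0.5 * y, 0.5 * x)
        if t <= 0.5:
            x, y = curve(4 * t - 1)
            return (0.5 * x, 0.5 * (1 + y))
        if t <= 0.75:
            x, y = curve(4 * t - 2)
            return (0.5 * (1 + x), 0.5 * (1 + y))
        x, y = curve(4 * t - 3)
        return (0.5 * (2 - y), 0.5 * (1 - x))
    return new_curve


def phi0(t):
    """Initial curve: the segment from A = (0, 0) to B = (1, 0)."""
    return (t, 0.0)


def iterate(k):
    """The curve phi_k obtained by applying Phi k times to phi0."""
    curve = phi0
    for _ in range(k):
        curve = Phi(curve)
    return curve


def sup_dist(c1, c2, samples=1025):
    """Maximum-norm distance of two curves, measured on a dyadic grid of [0, 1]."""
    ts = [j / (samples - 1) for j in range(samples)]
    return max(max(abs(p - q) for p, q in zip(c1(t), c2(t))) for t in ts)


for k in range(1, 5):
    print(k, sup_dist(iterate(k), iterate(k + 3)), 2.0 ** (-k))
```

Each printed distance should stay below the bound $2^{-k}$ in the last column, in agreement with (2.10).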

It is interesting to note that both coordinates $x(t)$ and $y(t)$ are new examples
of continuous functions that are nowhere differentiable (cf. Sect. III.9).

Linear Mappings


Linear mappings are important examples of uniformly continuous functions. Let
$A$ be a matrix


(2.11)  $A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix}$.


We consider the mapping $x \mapsto y = Ax$, where

(2.12)  $y_i = \sum_{j=1}^n a_{ij} x_j$,  $i = 1, \dots, m$


(when working with matrices it is more convenient to write vectors as column
vectors, so that (2.12) is just the usual product of two matrices).

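(2.6) Theorem (Peano 1888a, p. 454). In the Euclidean norm, we have for all
$x \in \mathbb{R}^n$


(2.13)  $\|Ax\|_2 \le M \cdot \|x\|_2$  with  $M = \Bigl(\sum_{i=1}^m \sum_{j=1}^n a_{ij}^2\Bigr)^{1/2}$.


Proof. Applying the Cauchy-Schwarz inequality (1.5) to (2.12) gives

$y_i^2 \le \Bigl(\sum_{j=1}^n a_{ij}^2\Bigr) \Bigl(\sum_{j=1}^n x_j^2\Bigr)$,


and summing up from $i = 1$ to $m$ yields the desired statement.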


As a consequence of the linearity of $Ax$, we get

$\|Ax - Ax_0\|_2 \le M \cdot \|x - x_0\|_2$,


with $M$ given by Theorem 2.6. This shows that the mapping $x \mapsto Ax$ is uniformly
continuous on $\mathbb{R}^n$ (take $\delta = \varepsilon/M$ independent of $x$).


Example. Consider the two-dimensional matrix 


_{v2+1 1 er _ 
A=( ‘ Ja), M= 6 + 2V2 = 2.9713. 
FIGURE 2.5. Majorization of a linear function


In Fig. 2.5, we have plotted the sets $\{x ;\ \|x\|_2 \le 1\}$ and $\{y = Ax ;\ \|x\|_2 \le 1\}$.
We see that the second set lies in a disc of radius $M$, confirming the estimate
(2.13). Moreover, we observe that the value $M$ is not optimal.


The Matrix-Norm. The smallest number $M$ satisfying the inequality of (2.13) is
called the norm (or matrix-norm) of $A$. It is denoted by


(2.14)  $\|A\|_2 := \sup\{\|Ax\|_2 ;\ \|x\|_2 \le 1\}$.

Obviously, we have $\|A\|_2 \le M$ with the $M$ of (2.13), and

(2.15)  $\|Ax\|_2 \le \|A\|_2 \, \|x\|_2$


for all vectors $x$. The precise computation of $\|A\|_2$ involves the eigenvalues of
$A^T A$ and gives, for the above example, $\|A\|_2 = \sqrt{3 + \sqrt2 + \sqrt{5 + 2\sqrt2}} \approx
2.6855$ (see Fig. 2.5 and Exercise 4.9).
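
The two numbers quoted for the example, the bound $M$ of (2.13) and the norm $\|A\|_2$ of (2.14), can be reproduced with a short computation. The matrix entries used below (first row $(\sqrt2 + 1,\ 1)$, second row $(-1,\ 1)$) are an assumption: they were chosen because they reproduce both quoted values, and other second rows do so as well. The closed-form eigenvalue formula is specific to $2 \times 2$ matrices.

```python
import math

# Assumed entries (see the remark above); not taken verbatim from the text.
A = [[math.sqrt(2.0) + 1.0, 1.0],
     [-1.0, 1.0]]


def frobenius_bound(A):
    """The constant M of (2.13): square root of the sum of all squared entries."""
    return math.sqrt(sum(a * a for row in A for a in row))


def spectral_norm_2x2(A):
    """||A||_2 for a 2x2 matrix: square root of the largest eigenvalue of A^T A."""
    (a, b), (c, d) = A
    # Entries of the symmetric matrix A^T A = [[p, q], [q, r]].
    p = a * a + c * c
    q = a * b + c * d
    r = b * b + d * d
    # Largest root of the characteristic polynomial of [[p, q], [q, r]].
    lam_max = 0.5 * (p + r) + math.sqrt(0.25 * (p - r) ** 2 + q * q)
    return math.sqrt(lam_max)


M = frobenius_bound(A)
norm_A = spectral_norm_2x2(A)
print(round(M, 4), round(norm_A, 4))
```

The gap between the two printed values is the non-optimality of $M$ observed in Fig. 2.5.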


Hausdorff’s Characterization of Continuous Functions 


We are interested in a new characterization of continuity, more elegant than that 
of Definition 2.1. Instead of working with norms, we shall use neighborhoods and 
open sets. 

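For a given function $f : \mathbb{R}^n \to \mathbb{R}^m$ and for sets $U \subset \mathbb{R}^n$, $V \subset \mathbb{R}^m$, we
denote by


(2.16)  $f(U) = \{f(x) \in \mathbb{R}^m ;\ x \in U\}$   the direct image of $U$,
(2.17)  $f^{-1}(V) = \{x \in \mathbb{R}^n ;\ f(x) \in V\}$   the inverse image of $V$.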


(2.7) Example. We choose a function f : R? — R?, mapping (2, y) to (u,v) by 


3 
(2.18) wsats, v= (et2)y*—S(@+ly +. 
This function is sketched for $-1.1 \le x, y \le 1.1$ in Fig. 2.6. For a subset $U$ (light
grey animal of the feline species ¹) we have drawn the set $f(U)$ and for $V$ (dark
grey animal of the feline species) the set $f^{-1}(V)$. We observe that the inverse
image of a connected set is not necessarily connected. This is due to the fact that
in our example, the function $f$ is not bijective.


FIGURE 2.6. Direct and inverse images for the function (2.18) 


Characterization of Continuity by Neighborhoods. The set of $x \in \mathbb{R}^n$
satisfying $\|x - x_0\| < \delta$ is $B_\delta(x_0)$ (see Eq. (1.23)), the set of $x \in \mathbb{R}^n$ satisfying
$\|f(x) - y_0\| < \varepsilon$ is $f^{-1}(B_\varepsilon(y_0))$. Therefore, if $y_0 = f(x_0)$ and $A = \mathbb{R}^n$, the
condition of Definition 2.1 can be expressed by

(2.19)  $\forall \varepsilon > 0 \;\; \exists \delta > 0 \quad B_\delta(x_0) \subset f^{-1}(B_\varepsilon(y_0))$.


Since a neighborhood $V$ of $y_0$ is characterized by the existence of an $\varepsilon > 0$ such
that $B_\varepsilon(y_0) \subset V$, we see that (2.19) is equivalent to the following:


(2.20)  for every neighborhood $V$ of $y_0$, $f^{-1}(V)$ is a neighborhood of $x_0$.


This interpretation of continuity at $x_0$ is more elegant, and is still valid in more
general “topological spaces”.

A characterization in terms of open and closed sets of a function $f : \mathbb{R}^n \to
\mathbb{R}^m$ being everywhere continuous is given by the following theorem.


(2.8) Theorem (see Hausdorff 1914, p. 361). For a function $f : \mathbb{R}^n \to \mathbb{R}^m$ the
following three statements are equivalent:


i) $f$ is continuous on $\mathbb{R}^n$;
ii) for every open set $V \subset \mathbb{R}^m$, the set $f^{-1}(V)$ is open in $\mathbb{R}^n$;
iii) for every closed set $F \subset \mathbb{R}^m$, the set $f^{-1}(F)$ is closed in $\mathbb{R}^n$.


¹ Кот Арнольда (Arnold's cat).




Proof. (i) $\Rightarrow$ (ii): let $V \subset \mathbb{R}^m$ be an open set and take $x_0 \in f^{-1}(V)$, so that
$f(x_0) \in V$. Since $V$ is open, it is a neighborhood of $f(x_0)$ and by (2.20) $f^{-1}(V)$
is a neighborhood of $x_0$. This is true for all $x_0 \in f^{-1}(V)$. Hence, the set $f^{-1}(V)$
is open by Definition 1.11.

(ii) $\Rightarrow$ (i): assuming (ii), we shall prove that $f$ is continuous at an arbitrary
point $x_0 \in \mathbb{R}^n$. Let $\varepsilon > 0$ be given and set $y_0 = f(x_0)$. The set $B_\varepsilon(y_0)$ is open,
so that by assumption (ii), $f^{-1}(B_\varepsilon(y_0))$ is also open. Definition 1.11 then implies
the existence of a $\delta > 0$ with $B_\delta(x_0) \subset f^{-1}(B_\varepsilon(y_0))$. But this is simply the
continuity of $f$ at $x_0$ (see (2.19)).

(ii) $\Leftrightarrow$ (iii): the equivalence of statements (ii) and (iii) follows from the identity
$f^{-1}(\complement V) = \complement(f^{-1}(V))$ and from Theorem 1.14.


FIGURE 2.7. Inverse image for the function (2.21) 


FIGURE 2.8. Inverse image for the function (2.21) 


(2.9) Example. Let f : R — R be given by f(0) = 0 and 
(2.21) f(a) = sin(1/2x?) for «#0. 


This function is discontinuous at x = 0. We shall demonstrate that for discontin- 
uous functions, (ii) and (iii) above are not true in general. 

For example, the set $V = (1/3, 2/3)$ is open and its inverse image $f^{-1}(V) =
(x_2, x_1) \cup (x_4, x_3) \cup \ldots$ is also open (see Fig. 2.7). However, the set $F = [1/3, 2/3]$
is closed, but $f^{-1}(F) = [x_2, x_1] \cup [x_4, x_3] \cup \ldots$ is not closed, because the limit
of the sequence $\{x_i\}$ does not lie in $f^{-1}(F)$.

For the open set $V = (-1/2, 1/2)$ the inverse image $f^{-1}(V) = (x_0, \infty) \cup
(x_2, x_1) \cup \ldots \cup \{0\}$ is not open because it is not a neighborhood of $0$ (see Fig. 2.8).
On the other hand, the inverse image of the closed set $F = [-1/2, 1/2]$, which is
$f^{-1}(F) = [x_0, \infty) \cup [x_2, x_1] \cup \ldots \cup \{0\}$, is closed.


(2.10) Example. Our last example illustrates the fact that Theorem 2.8 does not
have an analog for direct images. We consider the continuous function $f : \mathbb{R} \to \mathbb{R}$
defined by (see Fig. 2.9)


(2.22)  $f(x) = \dfrac{2x}{1 + x^2}.$


The image of the open set $U = (3/4, 2)$ is $f(U) = (4/5, 1]$, which is not open;
that of the closed set $F = [3, \infty)$ is $f(F) = (0, 3/5]$, which is not closed.


FIGURE 2.9. Direct images for the function (2.22)
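
The two direct images of Example 2.10 can be checked numerically. The following is only a sketch: it samples the sets on a finite grid (the helper `sample_open_interval` and the cutoff near $10^9$ for the unbounded set are illustrative assumptions), so it suggests the endpoints of the images without proving anything.

```python
def f(x):
    """The function (2.22): f(x) = 2x / (1 + x^2)."""
    return 2.0 * x / (1.0 + x * x)


def sample_open_interval(lo, hi, n=100001):
    """Return n equally spaced points strictly inside (lo, hi)."""
    step = (hi - lo) / (n + 1)
    return [lo + step * (k + 1) for k in range(n)]


# Image of the open set U = (3/4, 2): values fill (4/5, 1].
values_U = [f(x) for x in sample_open_interval(0.75, 2.0)]
sup_U, inf_U = max(values_U), min(values_U)

# Image of the closed set F = [3, oo), truncated at a large cutoff:
# values fill (0, 3/5]; the value 0 is approached but never reached.
xs_F = [3.0 + 10.0 ** (k / 1000.0) - 1.0 for k in range(9001)]
values_F = [f(x) for x in xs_F]
sup_F, inf_F = max(values_F), min(values_F)

print("f(U): inf ~", inf_U, " sup ~", sup_U, " f(1) =", f(1.0))
print("f(F): inf ~", inf_F, " sup =", sup_F)
```

The value $1 = f(1)$ is attained inside $U$, which is why $f(U)$ contains its right endpoint, while $4/5 = f(2)$ is only approached.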


Integrals with Parameters 


Suppose that we have a function of two variables $f(x, p)$ defined for $x \in [a, b]$
and $p \in [c, d]$. If we integrate this function with respect to $x$,


(2.23)  $F(p) = \int_a^b f(x, p)\, dx,$


we obtain a function of $p$. The question is whether we can ensure that $F(p)$ is
continuous.


(2.11) Counterexamples. In formula (b) of Exercise III.5.9, we replace n? by 
1/p, and then by p: 


t/P 
(2.24) flop) = 7. p>0,0<a<1, 
x 
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and, in both cases, f(x, p) = 0if p = 0. 

In the first case, p — 0 corresponds to n — oo in Fig. III.5.5.b, hence F'(p) = 
uk f(a, p) dx will tend to a nonzero constant, whereas F'(0) = 0. We observe that 
f(x, p) is continuous everywhere except at the point « = p = 0. 

In the second case, for p — 0, the function f(x, p) represents a hump that 
flattens out to infinity while preserving the same area. Again, F'(p) is not contin- 
uous at p = 0. This time, f(x, p) is continuous everywhere, but the domain of 
integration is unbounded. 


In the case where $f(x, p)$ is continuous everywhere and the domain of integration
is a compact interval, we know that $f(x, p)$ is uniformly continuous (Theorem 2.5)
and it is an easy exercise to prove the following theorem (see also the proof of
Theorem 3.11 below).


(2.12) Theorem. If $f(x, p)$ is a continuous function on $[a, b] \times [c, d]$, then


$F(p) = \int_a^b f(x, p)\, dx$


is a continuous function on $[c, d]$.


Exercises 


2.1 Show that there are three different values of t for which the Hilbert curve 
oo (t) is equal to (1/2, 1/2). 


2.2 Prove that the “matrix-norm” (2.14) is a norm on $\mathbb{R}^{m \cdot n}$.


FIGURE 2.10. Peano’s curve 


2.3 a) Fig.2.10 shows Peano’s original formulas (see Peano 1890) coded and 
plotted. Give an explanation similar to that of Fig.2.4 for its construction 
(you will need an animal that connects opposite corners of a square). 

b) In the very last sentence of his paper, Peano asserts, without any further
explanation, that $x$ and $y$ as functions of $t$ have nowhere a derivative (“Ces
$x$ et $y$, fonctions continues de la variable $t$, manquent toujours de dérivée”).
Prove this statement.

Hint. Adapt de Rham's proof of Theorem III.9.1 by choosing $\alpha_n = i/9^n$,
$\beta_n = (i+1)/9^n$. For these arguments the Peano curve is in opposite corners
of a square of side $3^{-n}$, so that $r_n = 3^n$.


2.4 Show that if $K \subset \mathbb{R}^n$ is compact and if $f : K \to \mathbb{R}^m$ is continuous, then
$f(K) \subset \mathbb{R}^m$ is compact.


2.5 The function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by


$f(x_1, x_2) = \dfrac{x_1^2 - x_2^2}{x_1^2 + x_2^2}$ if $x_1^2 + x_2^2 > 0$,  $f(x_1, x_2) = 0$ if $x_1 = x_2 = 0$


is discontinuous at $(0,0)$ (why?). Find an open set $U \subset \mathbb{R}$ and a closed set
$F \subset \mathbb{R}$, such that $f^{-1}(U)$ is not open and $f^{-1}(F)$ is not closed.


2.6 Define a map $P : \mathbb{R}^2 \to \mathbb{R}^2$ (which we call a projection) by
$P(x_1, x_2) = (x_1, 0)$.

a) Show that $P$ is continuous.

b) Find an open set $U \subset \mathbb{R}^2$ for which $P(U)$ is not open.

c) Find a closed set $F \subset \mathbb{R}^2$ for which $P(F)$ is not closed.

Remark. (b) is very easy, but (c) is less easy. Because of Exercise 2.4, you
will have to look for an unbounded $F$.


FIGURE 2.11. Plot of $(\cos x - \cos y)/(x - y)$


2.7 A naive user of a mathematical computer package (such as “Maple”) wants
a 3D plot of the function


$g(x, y) = \dfrac{\cos x - \cos y}{x - y}, \qquad -8 \le x \le 8, \quad -8 \le y \le 8,$


and obtains a result like that of Fig. 2.11. How must $g$ be defined for $x = y$ in
order to obtain a continuous function? Then verify, for the function obtained,
the conditions of Definition 2.1 for a continuous function of two variables.
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IV.3 Differentiable Functions of Several Variables 


We Germans use instead, following Jacobi, the round $\partial$ for partial derivatives.
(Weierstrass 1874)

Our next aim is to introduce the notion of differentiability for functions of more
than one variable. Since a division by the vector $x - x_0$ does not make sense, there
is no direct way of extending Definition III.6.1.


Partial Derivatives. If, in considering a function $f : U \to \mathbb{R}$, $U \subset \mathbb{R}^n$, we fix
all variables but one and regard $f$ as a function of this single variable, we can
apply Definition III.6.1. Consider, for example, a function $y = f(x_1, x_2)$ of two
variables in a neighborhood of $(x_{10}, x_{20})$. We then denote the derivatives by


(3.1)  $\lim_{h \to 0} \dfrac{f(x_{10} + h, x_{20}) - f(x_{10}, x_{20})}{h} =: \dfrac{\partial f}{\partial x_1}(x_{10}, x_{20}),$
       $\lim_{h \to 0} \dfrac{f(x_{10}, x_{20} + h) - f(x_{10}, x_{20})}{h} =: \dfrac{\partial f}{\partial x_2}(x_{10}, x_{20}),$


and call them partial derivatives of $f$ with respect to $x_1$ and $x_2$, respectively.
Other notations are $f_{x_1}(x_{10}, x_{20})$, $D_1 f(x_{10}, x_{20})$, $\partial_1 f(x_{10}, x_{20})$, or the like.
Geometrically, these partial derivatives can be interpreted as follows: the
function $y = f(x_1, x_2)$ defines a surface in $\mathbb{R}^3$ (with coordinates $x_1$, $x_2$, and
$y$) whose intersection with the plane $x_2 = x_{20}$ is the curve $x_1 \mapsto f(x_1, x_{20})$.
Therefore, the partial derivative $\partial f/\partial x_1$ is the slope of this curve, and


$y = f(x_{10}, x_{20}) + \dfrac{\partial f}{\partial x_1}(x_{10}, x_{20})\,(x_1 - x_{10})$


is the tangent to this curve at $(x_{10}, x_{20})$. Similarly, the tangent to the curve $x_2 \mapsto
f(x_{10}, x_2)$ is $y = f(x_{10}, x_{20}) + \partial f/\partial x_2(x_{10}, x_{20})\,(x_2 - x_{20})$, and the plane
spanned by these two tangents is given by


(3.2)  $y = f(x_{10}, x_{20}) + \dfrac{\partial f}{\partial x_1}(x_{10}, x_{20})\,(x_1 - x_{10}) + \dfrac{\partial f}{\partial x_2}(x_{10}, x_{20})\,(x_2 - x_{20}).$


The function $f(x_1, x_2)$ will be called differentiable at $(x_{10}, x_{20})$, if the plane (3.2)
is a “good” approximation to $f(x_1, x_2)$ in a neighborhood of $(x_{10}, x_{20})$ and not
only along the lines $x_1 = x_{10}$ and $x_2 = x_{20}$.


(3.1) Example. The surface defined by $y = e^{-x_1^2 - x_2^2}$ is plotted in Fig. 3.1. The
partial derivatives of this function are


$\dfrac{\partial f}{\partial x_1}(x_1, x_2) = -2x_1 e^{-x_1^2 - x_2^2}, \qquad \dfrac{\partial f}{\partial x_2}(x_1, x_2) = -2x_2 e^{-x_1^2 - x_2^2}.$


By evaluating these derivatives at $(x_{10}, x_{20}) = (0.8, 1.0)$, we get the tangent plane
at this point with the help of Eq. (3.2). It is included in Fig. 3.1.
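
As a rough numerical companion to Example 3.1, the sketch below evaluates the tangent plane (3.2) at $(0.8, 1.0)$ and compares it with the surface nearby. The function names and the step sizes are illustrative choices, not part of the text.

```python
import math


def f(x1, x2):
    """Surface of Example 3.1: y = exp(-x1^2 - x2^2)."""
    return math.exp(-x1 * x1 - x2 * x2)


def grad_f(x1, x2):
    """Analytic partial derivatives (df/dx1, df/dx2)."""
    value = f(x1, x2)
    return (-2.0 * x1 * value, -2.0 * x2 * value)


def tangent_plane(x1, x2, x10=0.8, x20=1.0):
    """Right-hand side of Eq. (3.2) with base point (x10, x20)."""
    d1, d2 = grad_f(x10, x20)
    return f(x10, x20) + d1 * (x1 - x10) + d2 * (x2 - x20)


def central_difference(g, t, h=1e-6):
    """Symmetric difference quotient approximating g'(t)."""
    return (g(t + h) - g(t - h)) / (2.0 * h)


d1, d2 = grad_f(0.8, 1.0)
d1_num = central_difference(lambda t: f(t, 1.0), 0.8)
d2_num = central_difference(lambda t: f(0.8, t), 1.0)

# The gap between surface and plane shrinks roughly like the squared distance.
err_far = abs(f(0.9, 1.1) - tangent_plane(0.9, 1.1))
err_near = abs(f(0.81, 1.01) - tangent_plane(0.81, 1.01))
print(d1, d1_num, d2, d2_num, err_far, err_near)
```

Reducing the distance to the base point by a factor of 10 reduces the error by a factor of roughly 100, which is the behavior expected of a "good" approximation.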
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FIGURE3.1. Tangent plane to the surface y = e7 ti (stereogram) 


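Two Dependent Variables. In the case of two functions of two variables


(3.3)  $y_1 = f_1(x_1, x_2), \qquad y_2 = f_2(x_1, x_2),$


we write (3.2) for each of the two functions:


(3.4)  $y_1 = f_1(x_{10}, x_{20}) + \dfrac{\partial f_1}{\partial x_1}(x_{10}, x_{20})\,(x_1 - x_{10}) + \dfrac{\partial f_1}{\partial x_2}(x_{10}, x_{20})\,(x_2 - x_{20}),$
       $y_2 = f_2(x_{10}, x_{20}) + \dfrac{\partial f_2}{\partial x_1}(x_{10}, x_{20})\,(x_1 - x_{10}) + \dfrac{\partial f_2}{\partial x_2}(x_{10}, x_{20})\,(x_2 - x_{20}).$


This formula is conveniently written in vector notation as


(3.4')  $y = f(x_0) + f'(x_0)(x - x_0),$


where $f'(x_0)$ is now a matrix, the so-called Jacobian (see Jacobi 1841):


(3.5)  $f'(x_0) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1}(x_0) & \dfrac{\partial f_1}{\partial x_2}(x_0) \\ \dfrac{\partial f_2}{\partial x_1}(x_0) & \dfrac{\partial f_2}{\partial x_2}(x_0) \end{pmatrix}.$


This notation will allow us to carry over most formulas of Sect. III.6 to the case of
several variables.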


(3.2) Example. Consider the function $f : \mathbb{R}^2 \to \mathbb{R}^2$ defined by


(3.6)  $f(x) = \begin{pmatrix} f_1(x_1, x_2) \\ f_2(x_1, x_2) \end{pmatrix} = \begin{pmatrix} \sqrt{2}\,x_1 + \sin(x_1 + x_2) \\ \sqrt{2}\,x_2 + \cos(x_1 - x_2) \end{pmatrix}.$


This function sends the origin $(x_1, x_2) = (0, 0)$ to the point $(y_1, y_2) = (0, 1)$,
straight lines to curves, and small squares to sets that look like parallelograms
(see Fig. 3.2). The Jacobian for (3.6) is


(3.7)  $f'(x) = \begin{pmatrix} \sqrt{2} + \cos(x_1 + x_2) & \cos(x_1 + x_2) \\ -\sin(x_1 - x_2) & \sqrt{2} + \sin(x_1 - x_2) \end{pmatrix}$


and Eq. (3.4) becomes, for $x_0 = (0, 0)^T$, $y_0 = f(x_0)$,


(3.8)  $\begin{pmatrix} y_1 - y_{10} \\ y_2 - y_{20} \end{pmatrix} = \begin{pmatrix} \sqrt{2} + 1 & 1 \\ 0 & \sqrt{2} \end{pmatrix} \begin{pmatrix} x_1 - x_{10} \\ x_2 - x_{20} \end{pmatrix}.$


The linear map given by (3.8) is precisely that of Fig. 2.5, and on comparing the
two pictures, one can see that the nonlinear mapping (3.6) is approximated, in a
small neighborhood of $x_0$, by the linear map defined by the Jacobian. We observe
that for small values of $x - x_0$, the $x_1$-axis (i.e., $x_2 = x_{20} = 0$) is mapped to
a multiple of $(\sqrt{2} + 1, 0)^T$ and the $x_2$-axis to a multiple of $(1, \sqrt{2})^T$ (see the
arrows in Fig. 3.2). Hence, the columns of the Jacobian matrix are the images of
the “infinitesimal unit vectors”.
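
The statement that the columns of the Jacobian are the images of the "infinitesimal unit vectors" can be illustrated numerically. The sketch below (helper names are illustrative) builds each column of $f'(x_0)$ from a symmetric difference quotient along one coordinate direction and compares the result with (3.7) at the origin.

```python
import math

SQRT2 = math.sqrt(2.0)


def f(x1, x2):
    """The mapping (3.6)."""
    return (SQRT2 * x1 + math.sin(x1 + x2),
            SQRT2 * x2 + math.cos(x1 - x2))


def jacobian(x1, x2):
    """The analytic Jacobian (3.7), stored as a list of rows."""
    return [[SQRT2 + math.cos(x1 + x2), math.cos(x1 + x2)],
            [-math.sin(x1 - x2), SQRT2 + math.sin(x1 - x2)]]


def numerical_jacobian(x1, x2, h=1e-6):
    """Column j is the difference quotient of f along the j-th unit vector."""
    columns = []
    for e1, e2 in ((1.0, 0.0), (0.0, 1.0)):
        plus = f(x1 + h * e1, x2 + h * e2)
        minus = f(x1 - h * e1, x2 - h * e2)
        columns.append([(plus[i] - minus[i]) / (2.0 * h) for i in range(2)])
    # Transpose the list of columns into a list of rows.
    return [[columns[j][i] for j in range(2)] for i in range(2)]


J0 = jacobian(0.0, 0.0)
J0_num = numerical_jacobian(0.0, 0.0)
print(J0)
print(J0_num)
```

At the origin the first column is $(\sqrt{2} + 1, 0)^T$ and the second is $(1, \sqrt{2})^T$, in agreement with (3.8).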


FIGURE 3.2. Graph of the mapping (3.6)


Differentiability 


... that Weierstrass’s direct teaching had the effect of discouraging the 
spontaneity of the students and was only fully understandable by those who 
had already learned the subject somewhere else. The most important trea- 
tises have been written by foreigners ... Probably the first is by my friend 
Stolz (Innsbruck): “Vorlesungen über allgemeine Arithmetik” ... .

(F. Klein 1926, Entwicklung der Math., p.291) 


Let us consider a function

(3.9)  $f : U \to \mathbb{R}^m, \qquad U \subset \mathbb{R}^n,$

and assume that $x_0 \in U$ is an interior point of $U$ ($U$ is a neighborhood of $x_0$).


(3.3) Definition (Stolz 1887, Fréchet 1906). The function (3.9) is differentiable
at $x_0$ if there exists a linear mapping $f'(x_0) : \mathbb{R}^n \to \mathbb{R}^m$ and a function
$r : U \to \mathbb{R}^m$, continuous at $x_0$ and satisfying $r(x_0) = 0$, such that


(3.10)  $f(x) = f(x_0) + f'(x_0)(x - x_0) + r(x)\,\|x - x_0\|.$


(3.4) Remark. If a function is differentiable at $x_0$, then it is continuous at this
point. Furthermore, all its partial derivatives exist at $x_0$. This follows from the fact
that for $x - x_0 = h e_j$ (where $e_j = (0, \ldots, 0, 1, 0, \ldots, 0)^T$ with the $j$th component
equal to 1) Eq. (3.10) becomes


(3.11)  $\dfrac{f(x_0 + h e_j) - f(x_0)}{h} = f'(x_0)\,e_j + r(x_0 + h e_j)\,\dfrac{|h|}{h}.$


Since $r(x)$ is continuous at $x_0$, the limit of this expression exists for $h \to 0$ and is
equal to


$\dfrac{\partial f}{\partial x_j}(x_0) = f'(x_0)\,e_j, \qquad \text{whence} \qquad \dfrac{\partial f_i}{\partial x_j}(x_0) = e_i^T f'(x_0)\,e_j$


(here, $f(x) = (f_1(x), \ldots, f_m(x))^T$). Consequently, the linear mapping is unique.


The analog of Carathéodory's formulation (Eq. (6.6) in Sect. III.6) is given
by the following lemma.


(3.5) Lemma. The function $f(x)$ of (3.9) is differentiable at $x_0$ if and only if there
exists a matrix-valued function $\varphi(x)$, depending on $x_0$ and continuous at $x_0$, such
that


(3.12)  $f(x) = f(x_0) + \varphi(x)(x - x_0).$

The derivative of $f(x)$ at $x_0$ is given by $f'(x_0) = \varphi(x_0)$.


Proof. For a given function $\varphi(x)$ we put


$f'(x_0) = \varphi(x_0), \qquad r(x) = \big(\varphi(x) - \varphi(x_0)\big)\,\dfrac{x - x_0}{\|x - x_0\|}$


and we see that (3.10) holds. Since $(x - x_0)/\|x - x_0\|$ is bounded by 1, it follows
from the continuity of $\varphi(x)$ at $x_0$ that $r(x) \to 0$ for $x \to x_0$.
On the other hand, assume that (3.10) holds. We define $\varphi(x_0) := f'(x_0)$,
and, for $x \ne x_0$,


(3.13)  $\varphi(x) = f'(x_0) + r(x)\,\dfrac{(x - x_0)^T}{\|x - x_0\|}$


(observe that the product of the column vector $r(x)$ with the row vector $(x - x_0)^T$
yields a matrix), and obtain $\varphi(x)(x - x_0) = f'(x_0)(x - x_0) + r(x)\,\|x - x_0\|$. The
function $\varphi(x)$ is continuous at $x_0$ because, by Theorem 2.6, $\|\varphi(x) - f'(x_0)\| \le
\|r(x)\|$, and $\|r(x)\| \to 0$ for $x \to x_0$.


The following result gives a sufficient condition for differentiability, which
can be checked by considering partial derivatives only.


(3.6) Theorem. Consider a function $f : U \to \mathbb{R}$ and $x_0 \in U$ (interior point). If
all partial derivatives $\partial f/\partial x_i$ exist in a neighborhood of $x_0$ and are continuous
at $x_0$, then $f$ is differentiable at $x_0$.


Proof. We shall give the proof for the case $n = 2$. The extension to arbitrary $n$ is
straightforward. The idea is to write $f(x) - f(x_0)$ as


$f(x_1, x_2) - f(x_{10}, x_{20}) = \big(f(x_1, x_2) - f(x_{10}, x_2)\big) + \big(f(x_{10}, x_2) - f(x_{10}, x_{20})\big)$


and to apply Lagrange's Theorem III.6.11 to each of the differences. This yields


$f(x_1, x_2) - f(x_{10}, x_{20}) = \dfrac{\partial f}{\partial x_1}(\xi_1, x_2)\,(x_1 - x_{10}) + \dfrac{\partial f}{\partial x_2}(x_{10}, \xi_2)\,(x_2 - x_{20}).$


Putting $\varphi(x_1, x_2) = \Big(\dfrac{\partial f}{\partial x_1}(\xi_1, x_2),\ \dfrac{\partial f}{\partial x_2}(x_{10}, \xi_2)\Big)$, we have established (3.12).
The continuity of $\varphi(x)$ at $x_0$ follows from the assumptions.


By Definition 3.3, a vector-valued function $f(x) = (f_1(x), \ldots, f_m(x))^T$ is
differentiable at $x_0$ if and only if $f_i(x)$ is differentiable at $x_0$ for all $i = 1, \ldots, m$.
It thus follows from Theorem 3.6 that functions whose components are polynomials
in $x_1, \ldots, x_n$, rational functions, or elementary functions are differentiable at
points where they are well-defined.


Counterexamples 
Discontinuous Function Whose Partial Derivatives Exist Everywhere. Consider
the function $f : \mathbb{R}^2 \to \mathbb{R}$, given by


(3.14)  $f(x_1, x_2) = \dfrac{x_1 x_2}{x_1^2 + x_2^2}$ if $x_1^2 + x_2^2 > 0$,  $f(x_1, x_2) = 0$ if $x_1 = x_2 = 0$


(see Fig. 2.2). The partial derivatives vanish at the origin, because $f(x_1, 0) = 0$
for all $x_1$ and $f(0, x_2) = 0$ for all $x_2$. Away from the origin, the existence of the
partial derivatives is clear. Nevertheless, the function (3.14) is not continuous at
the origin (see Sect. IV.2).


Discontinuous Function Whose Directional Derivatives Exist Everywhere.
Partial derivatives are special cases of the so-called directional derivatives. Consider
a function $f : \mathbb{R}^2 \to \mathbb{R}$ and a vector $v$ of length 1 ($\|v\|_2 = 1$). Then
$g(t) := f(x_0 + tv)$ represents the curve formed by the intersection of the surface
$y = f(x_1, x_2)$ with the vertical plane $\{(x, y) \mid x = x_0 + tv,\ t \in \mathbb{R}\}$. Its derivative
is denoted by


(3.15)  $\dfrac{\partial f}{\partial v}(x_0) := g'(0) = \lim_{t \to 0} \dfrac{f(x_0 + tv) - f(x_0)}{t}$


and is called the directional derivative of $f$ (in direction of $v$). Partial derivatives
are obtained for $v = (1, 0)^T$ and $v = (0, 1)^T$.

Consider the function


(3.16)  $f(x_1, x_2) = \dfrac{x_1^2 x_2}{x_1^4 + x_2^2}$ if $x_1^2 + x_2^2 > 0$,  $f(x_1, x_2) = 0$ if $x_1 = x_2 = 0$.


For $v = (\cos\theta, \sin\theta)^T$ we get


$g(t) = f(tv) = \dfrac{t \cos^2\theta \sin\theta}{t^2 \cos^4\theta + \sin^2\theta}.$


This function is differentiable at $t = 0$ for any value of $\theta$ (observe that for $\sin\theta =
0$ we have $g(t) = 0$ for all $t$). Hence, all directional derivatives exist. However, on
the parabolas $x_2 = a x_1^2$ the function is constant, namely $f(x_1, a x_1^2) = a/(1 + a^2)$,
and all values between $-1/2$ and $1/2$ are assumed in each neighborhood of the
origin (see Fig. 3.3). Thus, it is not continuous there.
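
Both halves of this counterexample can be observed numerically. The sketch below (function names and sample values are illustrative) evaluates the difference quotient of (3.16) along a fixed direction, which settles at $\cos^2\theta/\sin\theta$ when $\sin\theta \ne 0$, and then follows a parabola $x_2 = a x_1^2$ towards the origin, where the function stays at $a/(1+a^2)$.

```python
import math


def f(x1, x2):
    """The function (3.16)."""
    if x1 == 0.0 and x2 == 0.0:
        return 0.0
    return x1 * x1 * x2 / (x1 ** 4 + x2 * x2)


def directional_quotient(theta, t):
    """(f(t v) - f(0)) / t for the unit vector v = (cos theta, sin theta)."""
    return f(t * math.cos(theta), t * math.sin(theta)) / t


def on_parabola(a, x1):
    """Value of f on the parabola x2 = a * x1^2."""
    return f(x1, a * x1 * x1)


theta = math.pi / 3.0
limit_expected = math.cos(theta) ** 2 / math.sin(theta)
for t in (1e-2, 1e-4, 1e-6):
    print("t =", t, " quotient =", directional_quotient(theta, t))

# Approaching the origin along x2 = x1^2 the value remains 1/2, not f(0,0) = 0.
for x1 in (1e-1, 1e-3, 1e-6):
    print("x1 =", x1, " f on parabola =", on_parabola(1.0, x1))
```

Every straight line through the origin sees a differentiable function, yet the parabolas reveal the discontinuity.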


FIGURE 3.3. The function (3.16) (stereogram) 


A Geometrical Interpretation of the Gradient 


For a function $f : U \to \mathbb{R}$, i.e., the case $m = 1$ and $n$ arbitrary, the matrix $f'(x_0)$
of (3.5) is a row vector. It is usually denoted by


(3.17)  $\operatorname{grad} f = \Big(\dfrac{\partial f}{\partial x_1}, \dfrac{\partial f}{\partial x_2}, \ldots, \dfrac{\partial f}{\partial x_n}\Big) = \Big(\dfrac{\partial}{\partial x_1}, \dfrac{\partial}{\partial x_2}, \ldots, \dfrac{\partial}{\partial x_n}\Big) f = \nabla f.$


Here, the formal vector (Hamilton 1853, art. 620)

$\nabla = \Big(\dfrac{\partial}{\partial x_1}, \dfrac{\partial}{\partial x_2}, \ldots, \dfrac{\partial}{\partial x_n}\Big)$

is called Nabla “owing to its fancied resemblance to an Assyrian harp” (J.W. Gibbs
1907, p. 138). Equation (3.10) then becomes


(3.18)  $f(x) = f(x_0) + \operatorname{grad} f(x_0) \cdot (x - x_0) + r(x)\,\|x - x_0\|,$


and the equation $y = f(x_0) + \operatorname{grad} f(x_0) \cdot (x - x_0)$ of the tangent plane to the
surface $y = f(x)$ (see (3.2)) appears again.

In order to investigate the function $f(x)$ in a neighborhood of $x_0$, we put
$x = x_0 + tv$ and neglect the last term in (3.18). This yields


(3.19)  $f(x_0 + tv) = f(x_0) + t \operatorname{grad} f(x_0) \cdot v + \ldots .$


Assuming that $v$ is a vector of length 1, we can deduce the following properties:


• The vector $\operatorname{grad} f(x_0)$ is orthogonal to the level curve $\{x \,;\ f(x) = f(x_0)\}$.
This follows from (3.19) if we let $t \to 0$, because $f(x_0 + tv) = f(x_0)$ implies
$\operatorname{grad} f(x_0) \cdot v = 0$.

• The function increases in directions $v$ where $\operatorname{grad} f(x_0) \cdot v > 0$. Because of
the Cauchy-Schwarz inequality (1.5), $v = \operatorname{grad} f(x_0)/\|\operatorname{grad} f(x_0)\|$ is the
direction in which $f(x)$ increases fastest. The direction of steepest descent is
the opposite vector $v = -\operatorname{grad} f(x_0)/\|\operatorname{grad} f(x_0)\|$.

• If $f(x)$ has a maximum (or minimum) at $x_0$, then we get the necessary condition
$\operatorname{grad} f(x_0) = 0$.


FIGURE 3.4. Level curves and gradients for the function (3.20)


Fig. 3.4 shows the level curves $f(x) = C$ (with $C = i/20$; $i = 1, \ldots, 30$)
for the function


(3.20)  $f(x_1, x_2) = x_1^2 - 4x_1 x_2 + 5x_2^2.$


Its gradient $\operatorname{grad} f(x_1, x_2) = (2x_1 - 4x_2,\ -4x_1 + 10x_2)$ is indicated by arrows.
We observe that the gradient is orthogonal to the level curve and that the length of
$\operatorname{grad} f(x)$ indicates the steepness of the surface $y = f(x)$.
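
The first two properties of the gradient can be probed numerically for the function (3.20). The sketch below is only illustrative: the base point $(1, 1)$, the 1-degree direction grid, and the helper names are assumptions made for the example.

```python
import math


def f(x1, x2):
    """The function (3.20)."""
    return x1 * x1 - 4.0 * x1 * x2 + 5.0 * x2 * x2


def grad_f(x1, x2):
    """Analytic gradient of (3.20)."""
    return (2.0 * x1 - 4.0 * x2, -4.0 * x1 + 10.0 * x2)


def directional_slope(x, v, t=1e-6):
    """Symmetric difference quotient of f at x in the unit direction v."""
    plus = f(x[0] + t * v[0], x[1] + t * v[1])
    minus = f(x[0] - t * v[0], x[1] - t * v[1])
    return (plus - minus) / (2.0 * t)


def steepest_direction(x, steps=360):
    """Scan unit vectors on a grid of angles; return the one with largest slope."""
    best_v, best_slope = None, -math.inf
    for k in range(steps):
        angle = 2.0 * math.pi * k / steps
        v = (math.cos(angle), math.sin(angle))
        slope = directional_slope(x, v)
        if slope > best_slope:
            best_v, best_slope = v, slope
    return best_v, best_slope


x0 = (1.0, 1.0)
g = grad_f(*x0)
norm = math.hypot(g[0], g[1])

# A unit vector orthogonal to the gradient is tangent to the level curve.
tangent = (-g[1] / norm, g[0] / norm)
tangent_slope = directional_slope(x0, tangent)

best_v, best_slope = steepest_direction(x0)
alignment = (best_v[0] * g[0] + best_v[1] * g[1]) / norm
print(g, tangent_slope, best_slope, norm, alignment)
```

The slope along the tangent direction is zero up to rounding, and the best direction found by the scan agrees with $\operatorname{grad} f(x_0)/\|\operatorname{grad} f(x_0)\|$ to within the grid resolution.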


The Chain Rule. We consider two functions


$\mathbb{R}^n \xrightarrow{\ f\ } \mathbb{R}^m \xrightarrow{\ g\ } \mathbb{R}^p, \qquad x \mapsto y \mapsto z,$


and study the differentiability of the composed function $(g \circ f)(x) = g(f(x))$.
As in Sect. III.6, we use Carathéodory's characterization (here Lemma 3.5). Assuming
that $f$ is differentiable at $x_0$ and $g$ at $y_0 = f(x_0)$, we have

$f(x) = f(x_0) + \varphi(x)(x - x_0), \qquad g(y) = g(y_0) + \psi(y)(y - y_0).$


Putting $y = f(x)$, $y_0 = f(x_0)$ and inserting the first equation into the second one,
we obtain


(3.21)  $g(f(x)) = g(f(x_0)) + \psi(f(x))\,\varphi(x)(x - x_0).$


Since the product $\psi(f(x))\,\varphi(x)$ is continuous at $x_0$, the derivative of $g \circ f$ is this
expression evaluated at $x_0$, i.e.,


(3.22)  $(g \circ f)'(x_0) = g'(y_0) \cdot f'(x_0).$


Written in coordinates, the product (3.22) becomes


(3.23)  $\dfrac{\partial z_i}{\partial x_k} = \sum_{j=1}^{m} \dfrac{\partial z_i}{\partial y_j}\,\dfrac{\partial y_j}{\partial x_k},$


which generalizes Leibniz's formula (Eq. (II.1.16)).


FIGURE 3.5. Movement of an elastic pendulum


Example. Suppose that the motion of an elastic pendulum is given in polar coordinates
$f(t) = (r(t), \varphi(t))^T$, see Fig. 3.5.¹ If we want to know the velocity in
cartesian coordinates


(3.24)  $\begin{pmatrix} x \\ y \end{pmatrix} = g(r, \varphi) = \begin{pmatrix} r \cos\varphi \\ r \sin\varphi \end{pmatrix},$


we have to differentiate $x$ and $y$ with respect to $t$. Since the Jacobi matrix of (3.24)
is given by


(3.25)  $g'(r, \varphi) = \begin{pmatrix} \cos\varphi & -r \sin\varphi \\ \sin\varphi & r \cos\varphi \end{pmatrix},$


we obtain from (3.22) that

$\dot{x} = \cos\varphi \cdot \dot{r} - r \sin\varphi \cdot \dot{\varphi}, \qquad \dot{y} = \sin\varphi \cdot \dot{r} + r \cos\varphi \cdot \dot{\varphi}$


(the derivative with respect to time $t$ is denoted by a dot). This permits, for example,
the computation of the kinetic energy


$T = \dfrac{m}{2}\,(\dot{x}^2 + \dot{y}^2) = \dfrac{m}{2}\,(\dot{r}^2 + r^2 \dot{\varphi}^2).$


¹ The curves of this figure are the solutions of differential equations and were calculated
by numerical methods (see Sect. II.9).
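
The chain rule computation for the pendulum can be checked against a direct difference quotient. The motion used below, $r(t) = 1 + 0.1 \sin 3t$ and $\varphi(t) = 0.5 \cos t$, is a made-up smooth trajectory chosen only for illustration; it is not the solution of the differential equations plotted in Fig. 3.5.

```python
import math


# Hypothetical motion in polar coordinates (illustrative assumption).
def r(t):
    return 1.0 + 0.1 * math.sin(3.0 * t)


def phi(t):
    return 0.5 * math.cos(t)


def r_dot(t):
    return 0.3 * math.cos(3.0 * t)


def phi_dot(t):
    return -0.5 * math.sin(t)


def cartesian(t):
    """Position (x, y) = g(r, phi) as in (3.24)."""
    return (r(t) * math.cos(phi(t)), r(t) * math.sin(phi(t)))


def velocity_chain_rule(t):
    """Jacobi matrix (3.25) applied to (r_dot, phi_dot), as in (3.22)."""
    c, s = math.cos(phi(t)), math.sin(phi(t))
    x_dot = c * r_dot(t) - r(t) * s * phi_dot(t)
    y_dot = s * r_dot(t) + r(t) * c * phi_dot(t)
    return (x_dot, y_dot)


def velocity_finite_difference(t, h=1e-6):
    """Symmetric difference quotient of the cartesian position."""
    xp, yp = cartesian(t + h)
    xm, ym = cartesian(t - h)
    return ((xp - xm) / (2.0 * h), (yp - ym) / (2.0 * h))


def kinetic_energy_cartesian(t, m=1.0):
    x_dot, y_dot = velocity_chain_rule(t)
    return 0.5 * m * (x_dot ** 2 + y_dot ** 2)


def kinetic_energy_polar(t, m=1.0):
    return 0.5 * m * (r_dot(t) ** 2 + r(t) ** 2 * phi_dot(t) ** 2)


t0 = 0.7
print(velocity_chain_rule(t0), velocity_finite_difference(t0))
print(kinetic_energy_cartesian(t0), kinetic_energy_polar(t0))
```

The two expressions for the kinetic energy agree because the columns of (3.25) are orthogonal, with lengths $1$ and $r$.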


The Mean Value Theorem 


We wish to generalize the formula $f(b) - f(a) = f'(\xi)(b - a)$ of Lagrange's
Theorem (Sect. III.6) to several variables.


The Case m = 1. Consider a function $f : \mathbb{R}^n \to \mathbb{R}$ and let two points $a \in \mathbb{R}^n$
and $b \in \mathbb{R}^n$ be given. The idea is to connect these points by a straight line


$x = a + t(b - a), \qquad 0 \le t \le 1,$

and to put

$g(t) = f(a + (b - a)t).$


If $f(x)$ is differentiable at all points of the segment $\{a + (b - a)t \,;\ t \in (0, 1)\}$, $g(t)$ is also
differentiable, and it follows from (3.22) that


$g'(t) = f'(a + (b - a)t)\,(b - a).$

Since $g(0) = f(a)$, $g(1) = f(b)$, Theorem III.6.11 applied to the function $g(t)$
gives $g(1) - g(0) = g'(\tau)(1 - 0)$, and hence also

(3.26)  $f(b) - f(a) = f'(\xi)(b - a),$


where $\xi$ is a point on the segment connecting $a$ and $b$. Equation (3.26) looks like
(III.6.14), but here $f'(\xi)(b - a)$ is the scalar product of two vectors.


The General Case. For a function $f : \mathbb{R}^n \to \mathbb{R}^m$ we can apply (3.26) to each
component of $f(x)$. This gives

(3.27)  $\begin{pmatrix} f_1(b) - f_1(a) \\ \vdots \\ f_m(b) - f_m(a) \end{pmatrix} = \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1}(\xi_1) & \cdots & \dfrac{\partial f_1}{\partial x_n}(\xi_1) \\ \vdots & & \vdots \\ \dfrac{\partial f_m}{\partial x_1}(\xi_m) & \cdots & \dfrac{\partial f_m}{\partial x_n}(\xi_m) \end{pmatrix} \begin{pmatrix} b_1 - a_1 \\ \vdots \\ b_n - a_n \end{pmatrix},$


where all $\xi_i \in \mathbb{R}^n$ lie on the segment between $a$ and $b$. The drawback of this
formula is that the argument $\xi_i$ is different in each row.

We cannot hope that (3.26) is true for all functions $f : \mathbb{R}^n \to \mathbb{R}^m$. A counterexample
is $f_1(x) = \cos x$, $f_2(x) = \sin x$, $a = 0$, $b = 2\pi$. If we are content with
an inequality, the situation is as follows.


(3.7) Theorem. Let $f : U \to \mathbb{R}^m$, $U \subset \mathbb{R}^n$ be differentiable at all points of the
“open” segment $(a, b) := \{x = a + (b - a)t \,;\ 0 < t < 1\}$ (these points are
assumed to be interior points of $U$) and suppose that in the norm (2.14)


$\|f'(x)\| \le M$  for all $x \in (a, b)$.

Then, we have


(3.28)  $\|f(b) - f(a)\| \le M \cdot \|b - a\|.$


Proof. The idea is to consider the function


(3.29)  $g(t) := \sum_{i=1}^{m} c_i f_i(a + (b - a)t) = c^T f(a + (b - a)t),$


where the coefficients $c_1, \ldots, c_m$ are arbitrary for the moment. The derivative of
$g(t)$ is


$g'(t) = \sum_{i=1}^{m} c_i \sum_{j=1}^{n} \dfrac{\partial f_i}{\partial x_j}(a + (b - a)t)\,(b_j - a_j) = c^T f'(a + (b - a)t)\,(b - a).$


Application of Theorem III.6.11 now yields

(3.30)  $c^T\big(f(b) - f(a)\big) = g(1) - g(0) = g'(\tau) = c^T f'(\xi)(b - a),$


where $\xi = a + (b - a)\tau$ lies on the segment $(a, b)$. We now cleverly choose $c =
f(b) - f(a)$ to make the expression to the left in (3.30) as large as possible. Then,
applying the Cauchy-Schwarz inequality on the right of Eq. (3.30), we obtain with
(2.15) that

$\|f(b) - f(a)\|^2 \le \|f(b) - f(a)\| \cdot M \cdot \|b - a\|.$


This gives (3.28) after division by $\|f(b) - f(a)\|$ (note that for $\|f(b) - f(a)\| = 0$
statement (3.28) is obvious).
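
The counterexample and the inequality (3.28) can both be observed for the curve $f(x) = (\cos x, \sin x)$. In the sketch below the bound $M = 1$ is the Euclidean length of $f'(x) = (-\sin x, \cos x)^T$; the grid of sample points and the helper names are illustrative assumptions.

```python
import math


def f(x):
    """Counterexample curve: f_1 = cos, f_2 = sin."""
    return (math.cos(x), math.sin(x))


def f_prime(x):
    """Derivative of f, a matrix with one column."""
    return (-math.sin(x), math.cos(x))


def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(c * c for c in v))


a, b = 0.0, 2.0 * math.pi

# f(b) - f(a) vanishes (up to rounding) ...
increment = norm([fb - fa for fb, fa in zip(f(b), f(a))])

# ... although f'(xi)(b - a) has length 2*pi for every xi, so (3.26) must fail.
smallest_derivative_norm = min(
    norm(f_prime(a + (b - a) * k / 1000.0)) for k in range(1001)
)

M = 1.0


def bound_holds(left, right):
    """Check the inequality (3.28) for the segment [left, right]."""
    inc = norm([p - q for p, q in zip(f(right), f(left))])
    return inc <= M * abs(right - left) + 1e-15


print(increment, smallest_derivative_norm)
print(all(bound_holds(0.0, 0.1 * k) for k in range(1, 100)))
```

Geometrically, (3.28) says here that a chord of the unit circle is never longer than the corresponding arc.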


The Implicit Function Theorem 


Implicit equations $f(x, y) = C$ were the central theme of Descartes's “Géométrie”
of 1637 (see, for example, Eq. (I.1.18)). Nobody doubted that such equations define
geometric curves $y = y(x)$, and Leibniz knew how to differentiate such
functions. However, in the Weierstrass era (see Genocchi-Peano 1884, p. 149-151),
mathematicians felt a need for a more rigorous proof that guarantees that
$f(x, y) = C$ is equivalent to $y = y(x)$ in some neighborhood of a point $(x_0, y_0)$
satisfying $f(x_0, y_0) = C$. We then say that the implicit equation $f(x, y) = C$ can
be solved for $y$.

Consider, for example, the circles $x^2 + y^2 = C$ and fix a point $(x_0, y_0)$
satisfying $x_0^2 + y_0^2 = C$. If $y_0 > 0$, we obtain $y(x) = \sqrt{C - x^2}$, for $y_0 < 0$ we
have $y(x) = -\sqrt{C - x^2}$, but for $y_0 = 0$ it is impossible to find a function $y(x)$
that satisfies $x^2 + y(x)^2 = C$ for all $x$ in a neighborhood of $x_0$.

In the sequel, we put $F(x, y) = f(x, y) - C$ and replace the condition
$f(x, y) = C$ by $F(x, y) = 0$.


(3.8) Implicit Function Theorem. Consider a function $F : \mathbb{R}^2 \to \mathbb{R}$ and a point
$(x_0, y_0) \in \mathbb{R}^2$, and suppose that the partial derivatives $\partial F/\partial x$ and $\partial F/\partial y$ exist
and are continuous in a neighborhood of $(x_0, y_0)$. If


(3.31)  $F(x_0, y_0) = 0 \qquad \text{and} \qquad \dfrac{\partial F}{\partial y}(x_0, y_0) \ne 0,$


then there exist neighborhoods $U$ of $x_0$, $V$ of $y_0$, and a unique function $y : U \to V$
such that $y(x_0) = y_0$ and


(3.32)  $F(x, y(x)) = 0$  for all $x \in U$.

The function $y(x)$ is differentiable in $U$ and satisfies


(3.33)  $y'(x) = -\dfrac{\partial F/\partial x\,(x, y(x))}{\partial F/\partial y\,(x, y(x))}.$


Proof. We may assume that $\partial F/\partial y(x_0, y_0) > 0$ (otherwise we work with $-F$
instead of $F$). By continuity of $\partial F/\partial y$, there exist $\delta > 0$ and $\beta > 0$ such that


(3.34)  $\dfrac{\partial F}{\partial y}(x, y) \ge \beta > 0$  for $|x - x_0| \le \delta$ and $|y - y_0| \le \delta$.

This implies that $F(x_0, y)$ is a monotonically increasing function of $y$, and, since
$F(x_0, y_0) = 0$, we have $F(x_0, y_0 - \delta) < 0 < F(x_0, y_0 + \delta)$. The continuity of $F$
implies the existence of $\delta_1 > 0$ ($\delta_1 \le \delta$) such that (see Fig. 3.6)


$F(x, y_0 - \delta) < 0 < F(x, y_0 + \delta)$  for $|x - x_0| < \delta_1$.


We now put $U = (x_0 - \delta_1, x_0 + \delta_1)$, $V = (y_0 - \delta, y_0 + \delta)$ and apply for each
$x \in U$ Bolzano's Theorem III.3.5 to $F(x, y)$, considered as a function of $y$. This
implies the existence of a function $y : U \to V$ satisfying (3.32). The uniqueness
of $y(x)$ in $V$ follows from the monotonicity of $F(x, y)$ as a function of $y$.


FIGURE 3.6. Proof of the Implicit Function Theorem


We still have to prove that $y(x)$ is differentiable at an arbitrary point $x_1 \in U$.
As in the proof of Theorem 3.6, we use the relation


$F(x, y(x)) = F(x_1, y_1) + \dfrac{\partial F}{\partial x}(\xi, y(x))\,(x - x_1) + \dfrac{\partial F}{\partial y}(x_1, \eta)\,(y(x) - y_1),$


where $y_1 = y(x_1)$, $\xi$ is between $x$ and $x_1$, $\eta$ is between $y(x)$ and $y_1$. From (3.32)
and (3.34), we thus obtain


(3.35)  $y(x) - y_1 = \varphi(x)(x - x_1), \qquad \varphi(x) = -\dfrac{\partial F/\partial x\,(\xi, y(x))}{\partial F/\partial y\,(x_1, \eta)}.$


The function $\partial F/\partial x$ is continuous and thus bounded for $|x - x_0| \le \delta_1$ and
$|y - y_0| \le \delta$, say by $M$. This, together with (3.34), implies $|\varphi(x)| \le M/\beta$,
and the continuity of $y(x)$ is a consequence of (3.35). Once the continuity of $y(x)$
is proved, $\varphi(x)$ is seen to be continuous at $x_1$, so that $y(x)$ is differentiable at $x_1$.
Formula (3.33) is obtained by computing $\lim_{x \to x_1} \varphi(x)$.


Remark. If the differentiability of the function $y(x)$ is established, Eq. (3.33) is
obtained by differentiating the identity $F(x, y(x)) = 0$. This procedure is called
implicit differentiation and has been used already at the end of Sect. II.1.
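
The proof above is constructive enough to be imitated numerically: for each $x$ the sign change of $F(x, \cdot)$ on $V$ is located by bisection, in the spirit of Bolzano's theorem. The sketch below does this for the circle $F(x, y) = x^2 + y^2 - 1$ near $(x_0, y_0) = (0.6, 0.8)$; the bracket $[0.3, 1.3]$ (that is, $\delta = 0.5$) and all function names are assumptions made for the illustration.

```python
def F(x, y):
    """Implicit equation F(x, y) = 0 for the unit circle."""
    return x * x + y * y - 1.0


def dF_dx(x, y):
    return 2.0 * x


def dF_dy(x, y):
    return 2.0 * y


def solve_y(x, y_low=0.3, y_high=1.3, iterations=60):
    """Bisection in y; assumes F(x, y_low) < 0 < F(x, y_high)."""
    lo, hi = y_low, y_high
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if F(x, mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def slope_formula(x):
    """Right-hand side of (3.33) evaluated at (x, y(x))."""
    y = solve_y(x)
    return -dF_dx(x, y) / dF_dy(x, y)


def slope_finite_difference(x, h=1e-5):
    """Difference quotient of the numerically defined function y(x)."""
    return (solve_y(x + h) - solve_y(x - h)) / (2.0 * h)


print(solve_y(0.6))
print(slope_formula(0.6), slope_finite_difference(0.6))
```

For the circle, formula (3.33) reduces to $y'(x) = -x/y$, which gives $-0.75$ at the chosen point.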


Differentiation of Integrals with Respect to Parameters 


We now wish to know whether an integral containing a parameter $p$ (see Eq. (2.23))
is a differentiable function of $p$ and if so, whether its derivative can be computed
by exchanging integration and differentiation, i.e., by integrating $\partial f/\partial p$.


(3.9) Example. The integral


(3.36)  $\int_0^{\pi/2} e^{ax} \cos x \, dx = \dfrac{e^{a\pi/2} - a}{a^2 + 1}$


is best computed by taking the real part of $\int_0^{\pi/2} e^{(a+i)x}\, dx$. If we differentiate
both sides of (3.36) several times with respect to the parameter $a$, we obtain


a formula that would be much more difficult to obtain by other means. 
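
A single differentiation of (3.36) with respect to $a$ can be checked numerically. The sketch below uses a plain composite Simpson rule (an illustrative choice, as are the function names and the value $a = 0.5$) to compare $\int_0^{\pi/2} x\,e^{ax} \cos x\,dx$ with a difference quotient of the closed form on the right of (3.36).

```python
import math


def simpson(g, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * g(lo + k * h)
    return total * h / 3.0


def closed_form(a):
    """Right-hand side of (3.36)."""
    return (math.exp(a * math.pi / 2.0) - a) / (a * a + 1.0)


def integral(a):
    """Left-hand side of (3.36), computed numerically."""
    return simpson(lambda x: math.exp(a * x) * math.cos(x), 0.0, math.pi / 2.0)


def integral_of_derivative(a):
    """Integral of the integrand differentiated with respect to a."""
    return simpson(lambda x: x * math.exp(a * x) * math.cos(x), 0.0, math.pi / 2.0)


def derivative_of_closed_form(a, h=1e-5):
    """Symmetric difference quotient of the closed form."""
    return (closed_form(a + h) - closed_form(a - h)) / (2.0 * h)


a = 0.5
print(integral(a), closed_form(a))
print(integral_of_derivative(a), derivative_of_closed_form(a))
```

Agreement of the last two numbers is what exchanging integration and differentiation predicts in this example.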
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(3.10) Counterexample. Looking at Fig. III.5.5a, we observe that the integral of 
$f_n(x)$ behaves like $C/n$ for $n \to \infty$. This suggests the definition


(3.38)  $f(x, p) = \dfrac{x/p}{(1 + x^2/p^2)^2} = \dfrac{x\,p^3}{(p^2 + x^2)^2}$  for $p^2 + x^2 > 0$


and $f(0, 0) = 0$. Then,


(3.39)  $F(p) = \int_0^1 f(x, p)\, dx = \dfrac{1}{2} \cdot \dfrac{p}{1 + p^2}$


has the derivative $F'(0) = 1/2$. On the other hand, $\lim_{p \to 0} \dfrac{\partial f}{\partial p}(x, p)$ is identically
zero (see Fig. 3.7).
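
The failure of the exchange in this counterexample shows up clearly in a numerical experiment. The sketch below assumes the integration interval $[0, 1]$ and uses a composite Simpson rule with many subintervals, since the integrand has a narrow hump near $x \approx p$; the helper names and step counts are illustrative.

```python
def f(x, p):
    """The function (3.38), with f(0, 0) = 0."""
    if x == 0.0 and p == 0.0:
        return 0.0
    return x * p ** 3 / (p * p + x * x) ** 2


def df_dp(x, p):
    """Partial derivative of f with respect to p (zero at the origin)."""
    if x == 0.0 and p == 0.0:
        return 0.0
    s = p * p + x * x
    return 3.0 * x * p * p / s ** 2 - 4.0 * x * p ** 4 / s ** 3


def simpson(g, lo, hi, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * g(lo + k * h)
    return total * h / 3.0


def F(p):
    """F(p) = integral of f(x, p) over [0, 1], computed numerically."""
    return simpson(lambda x: f(x, p), 0.0, 1.0)


# Difference quotients (F(p) - F(0)) / p approach F'(0) = 1/2 ...
quotients = [F(p) / p for p in (0.1, 0.03, 0.01)]

# ... while integrating df/dp at p = 0 gives exactly zero.
exchanged = simpson(lambda x: df_dp(x, 0.0), 0.0, 1.0)
print(quotients, exchanged)
```

The hypothesis of Theorem 3.11 that fails here is the continuity of $\partial f/\partial p$ at the point $x = p = 0$.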


FIGURE 3.7. The function (3.38) (stereogram) 


(3.11) Theorem. Consider a function $f : [a, b] \times [c, d] \to \mathbb{R}$ and suppose that the
partial derivative $\dfrac{\partial f}{\partial p}(x, p)$ exists and is continuous on $[a, b] \times [c, d]$. If the integral


(3.40)  $F(p) = \int_a^b f(x, p)\, dx$


exists for all $p \in [c, d]$, then $F(p)$ is differentiable in $(c, d)$ with derivative

(3.41)  $F'(p_0) = \int_a^b \dfrac{\partial f}{\partial p}(x, p_0)\, dx.$


Proof. We consider the difference


(3.42)  $F(p) - F(p_0) = \int_a^b \big(f(x, p) - f(x, p_0)\big)\, dx.$


To the term on the right, we apply Lagrange's Theorem III.6.11, which gives


$F(p) - F(p_0) = \underbrace{\int_a^b \dfrac{\partial f}{\partial p}(x, \eta)\, dx}_{\varphi(p)} \cdot (p - p_0).$

Here, $\eta$ depends on $x$ and lies between $p$ and $p_0$. Since $\partial f/\partial p$ is continuous on
the compact set $[a, b] \times [c, d]$, we see as in the proof of Theorem 2.12 that $\varphi(p)$ is
continuous at $p_0$ and the statement follows from Eq. (III.6.6).


Exercises 


3.1 Consider the function f : R? —> R (see Fig. 3.8a), 


3 3 
ejpt+ 2x5 il 2 
s=— 1ifaj+r>0 
f(t1,%2) = 4 af +23 
0 if 7; =x = 0. 


Is $f$ continuous? Does it have directional derivatives at the origin? Are the
partial derivatives $\partial f/\partial x_1$ and $\partial f/\partial x_2$ continuous? Is $f$ differentiable?


FIGURE 3.8. Stereograms for Exercises 3.1, 3.2, and 3.3 


3.2 The same questions as before for the function $f(x_1, x_2) = \sqrt{|x_1 x_2|}$ (see
Fig. 3.8c; the Sydney Opera House). 


3.3. Show that f : R? —> R defined by 


1 
sin( +——,)_ if 2? +22 >0 
“1X2 in( a) Ly Xv 
0 if7, = 72 =0 


f(x1,%2) = 


(see Fig. 3.8b) is everywhere differentiable, but that the partial derivatives are
not continuous at the origin. This function is a bidimensional analog of the
function of Fig. III.6.1.
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FIGURE 3.9. Bernoulli’s lemniscate and Cassinian ovals 


3.4 For a given constant $\alpha$ define $f : \mathbb{R}^2 \to \mathbb{R}$ by

$f(x_1, x_2) = |x_1 x_2|^{\alpha}$ if $x_1 x_2 \ne 0$,  $f(x_1, x_2) = 0$ if $x_1 x_2 = 0$.


Determine the values of the parameter $\alpha$ for which (a) $f$ is continuous and
(b) for which $f$ is differentiable.


3.5 Show that for the function $f : \mathbb{R}^n \to \mathbb{R}$ defined by $f(x) = x^T A x$, where $A$
is a constant $n \times n$ matrix, the derivative is given by $f'(x) = x^T(A + A^T)$
(in case of trouble, write explicitly the components of $f$ for $n = 2$).

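3.6 Let $V(x, y)$ be a differentiable function and


$W(r, \varphi) := V(r \cos\varphi, r \sin\varphi).$

Apply the chain rule to show that

$\Big(\dfrac{\partial V}{\partial x}\Big)^2 + \Big(\dfrac{\partial V}{\partial y}\Big)^2 = \Big(\dfrac{\partial W}{\partial r}\Big)^2 + \dfrac{1}{r^2}\Big(\dfrac{\partial W}{\partial \varphi}\Big)^2.$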

3.7 We call a differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ homogeneous of degree $p$, if


(3.43)  $f(ax) = a^p f(x)$  for $a > 0$, $x \in \mathbb{R}^n$.
Show that the functions tan(x1/x2), \/2x? + 373 + 4x2, and x} —5aya23 + 


x?x3 are homogeneous (of which degree?) and show that a homogeneous 


function satisfies Euler's identity

$x_1 \dfrac{\partial f}{\partial x_1}(x) + \ldots + x_n \dfrac{\partial f}{\partial x_n}(x) = p\,f(x).$

Hint. Differentiate (3.43) with respect to $a$.


3.8 Study the functions $y(x)$ defined by the implicit equation

(3.44)  $(x^2 + y^2)^2 - 2x^2 + 2y^2 = C,$


which yields, for $C = 0$, the famous “lemniscate” of Jac. Bernoulli (1694,
see Fig. 3.9). Find the locus of points at which $\partial F/\partial y = 0$, i.e., the points
at which the Implicit Function Theorem does not apply. Also find the locus
of maximal values of the solutions of (3.44), i.e., points at which $y'(x) = 0$,
and show that they lie on a circle.
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3.9 Same question for the “folium cartesi1” 
$x^3 + y^3 = 3xy$


(see Fig. 4.2 below). 


3.10 Compute the points $x \in \mathbb{R}^2$, where the columns of the matrix $f'(x)$ of (3.7)
are vectors with the same direction (i.e., $\det f'(x) = 0$). These points are
marked by “o” and “×”, respectively, in Fig. 3.2.

Answer. $\big((k + l + 3/4)\pi,\ (k - l + 1/4)\pi\big)$ and $\big((k + l + 3/4)\pi,\ (k - l - 3/4)\pi\big)$
for $k, l \in \mathbb{Z}$.


3.11 Which of the following two integrals do you think is easier to evaluate: 


1 1 
| (In x) dx or | x (In x)” dx? 
0 ) 


Well, the second one can be differentiated with respect to the parameter a. 
Do this (after justification) and compute the two integrals. 


3.12 Given that 


a d 
i ene ayy, Sy, eC, for a>1l, 
0 


a—cosx az —1 
verify that 


i da _ 5V6r sia [ da _ v5 
o (5—cosx)2 288 9 (6—4cosz)3 1000 © 


3.13 Show that

$\int_0^a \dfrac{\log(1 + ax)}{1 + x^2}\, dx = \dfrac{1}{2} \arctan(a) \cdot \log(1 + a^2)$  for $a > 0$.


Hint. Differentiate the integral with respect to $a$, after justification.


3.14 Show that

$\int_0^\infty \dfrac{\sin x}{x}\, dx = \dfrac{\pi}{2}.$


Hint. Show, with the help of Definition III.8.1, Theorems 3.11 and III.6.18,
and Exercise II.4.2.h, that


$F(a) = \int_0^\infty e^{-ax}\, \dfrac{\sin x}{x}\, dx \quad \Rightarrow \quad F'(a) = -\int_0^\infty e^{-ax} \sin x\, dx = \dfrac{-1}{1 + a^2},$


if $a > 0$. Finally, by modifying the proof of Example III.8.5, show that $F(a)$
is one-sided continuous at $a = 0+$.
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IV.4 Higher Derivatives and Taylor Series 


Now it is easy to see that differentials of this kind keep the same value if one 

exchanges the order of differentiation with respect to the several variables. 

(Cauchy 1823, Résumé, p. 76) 

For the moment we consider functions $f(x, y)$ of two variables. Partial derivatives,
such as $\partial f/\partial x$, are again functions of two variables, and we can repeatedly
compute their partial derivatives as indicated in the following diagram:


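$$
\begin{array}{ccccc}
f & \xrightarrow{\ \partial/\partial x\ } & \dfrac{\partial f}{\partial x} & \xrightarrow{\ \partial/\partial x\ } & \dfrac{\partial^2 f}{\partial x^2} \\[2mm]
\downarrow {\scriptstyle \partial/\partial y} & & \downarrow {\scriptstyle \partial/\partial y} & & \\[2mm]
\dfrac{\partial f}{\partial y} & \xrightarrow{\ \partial/\partial x\ } & \dfrac{\partial^2 f}{\partial x\,\partial y}\,,\ \dfrac{\partial^2 f}{\partial y\,\partial x} & & \\[2mm]
\downarrow {\scriptstyle \partial/\partial y} & & \downarrow {\scriptstyle \partial/\partial y} & & \\[2mm]
\dfrac{\partial^2 f}{\partial y^2} & \xrightarrow{\ \partial/\partial x\ } & \dfrac{\partial^3 f}{\partial x\,\partial y^2}\,,\ \dfrac{\partial^3 f}{\partial y^2\,\partial x} & &
\end{array}
$$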

The question is whether these derivatives depend on the order of differentiation. 


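(4.1) Example. Following Euler (1734, Comm. Acad. Petrop., vol. VII, p. 177),
we consider the function $f(x,y) = \sqrt{x^2 + ny^2}$ and compute partial derivatives
(for $x^2 + ny^2 > 0$):

$\dfrac{\partial f}{\partial x} = \dfrac{x}{\sqrt{x^2+ny^2}}, \qquad \dfrac{\partial^2 f}{\partial y\,\partial x} = \dfrac{-nxy}{(x^2+ny^2)^{3/2}},$

$\dfrac{\partial f}{\partial y} = \dfrac{ny}{\sqrt{x^2+ny^2}}, \qquad \dfrac{\partial^2 f}{\partial x\,\partial y} = \dfrac{-nxy}{(x^2+ny^2)^{3/2}}.$

Euler then announces (see also Euler 1755, §226) that in general,

(4.1)   $\dfrac{\partial^2 f}{\partial y\,\partial x} = \dfrac{\partial^2 f}{\partial x\,\partial y}.$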


This, however, is not true without any further assumptions, as can be seen from 
the following counterexample. 


(4.2) Counterexample. H.A. Schwarz (1873) gave a first, rather complicated
counterexample for (4.1) (see Exercise 4.1). An easier counterexample, due to
Peano (1884, “Annotazione N. 103”), is obtained by considering

(4.2)   $f(x,y) = x\,y\,g(x,y)$,


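where $g(x,y)$ is bounded (not necessarily continuous) in a neighborhood of the
origin. For this function we have

$\dfrac{\partial f}{\partial x}(0,y) = \lim_{x\to 0}\dfrac{f(x,y) - f(0,y)}{x} = \lim_{x\to 0} y\,g(x,y).$

The derivative of this expression with respect to $y$ is

(4.3)   $\dfrac{\partial^2 f}{\partial y\,\partial x}(0,0) = \lim_{y\to 0}\Bigl(\lim_{x\to 0} g(x,y)\Bigr),$

provided that this limit exists. Similarly, we have

(4.4)   $\dfrac{\partial^2 f}{\partial x\,\partial y}(0,0) = \lim_{x\to 0}\Bigl(\lim_{y\to 0} g(x,y)\Bigr).$

We only have to choose a function $g(x,y)$ for which the limits in (4.3) and (4.4)
are different. This is the case for

(4.5)   $g(x,y) = \dfrac{x^2 - y^2}{x^2 + y^2}$   if $x^2 + y^2 > 0$,

for which $\lim_{x\to 0} g(x,y) = -1$ for all $y \ne 0$ and $\lim_{y\to 0} g(x,y) = +1$ for all
$x \ne 0$. Hence, the mixed partial derivatives

$\dfrac{\partial^2 f}{\partial y\,\partial x}(0,0) = -1$   and   $\dfrac{\partial^2 f}{\partial x\,\partial y}(0,0) = +1$

are different for the function defined by (4.2) and (4.5).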
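Peano's counterexample can be checked numerically. The following is only a sketch, assuming the function $f(x,y) = xy\,(x^2-y^2)/(x^2+y^2)$ with $f(0,0) = 0$ and central difference quotients; the helper names `fx`, `fy` and the step sizes are illustrative choices, not part of the text.

```python
def f(x, y):
    """Peano's function f = x*y*g with g = (x^2 - y^2)/(x^2 + y^2), f(0, 0) = 0."""
    r2 = x * x + y * y
    return 0.0 if r2 == 0.0 else x * y * (x * x - y * y) / r2

def fx(x, y, h=1e-7):
    """Central difference approximation of df/dx."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def fy(x, y, h=1e-7):
    """Central difference approximation of df/dy."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

k = 1e-3  # outer step, chosen much larger than the inner step h
d_yx = (fx(0.0, k) - fx(0.0, -k)) / (2 * k)  # approximates d/dy (df/dx) at the origin
d_xy = (fy(k, 0.0) - fy(-k, 0.0)) / (2 * k)  # approximates d/dx (df/dy) at the origin

print(d_yx, d_xy)
```

The inner step must be much smaller than the outer one, so that the inner limit is (approximately) taken first; the two printed values then come out close to $-1$ and $+1$, respectively.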


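(4.3) Theorem. Consider a function $f : \mathbb{R}^2 \to \mathbb{R}$ for which the partial derivatives
$\dfrac{\partial f}{\partial x}$, $\dfrac{\partial f}{\partial y}$, $\dfrac{\partial^2 f}{\partial y\,\partial x}$ exist in a neighborhood of $(x_0,y_0)$ and
$\dfrac{\partial^2 f}{\partial y\,\partial x}$ is continuous at $(x_0,y_0)$. Then, $\dfrac{\partial^2 f}{\partial x\,\partial y}$ exists at $(x_0,y_0)$ and we have

$\dfrac{\partial^2 f}{\partial x\,\partial y}(x_0,y_0) = \dfrac{\partial^2 f}{\partial y\,\partial x}(x_0,y_0).$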


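Proof. The idea is to consider a small rectangle with sides $h$ and $k$ and vertices
$(x_0,y_0)$, $(x_0+h,y_0)$, $(x_0,y_0+k)$, $(x_0+h,y_0+k)$. The values of $f$ at the
vertices are denoted by $f_{00}$, $f_{10}$, $f_{01}$, and $f_{11}$, respectively. The partial
derivatives are approximately given by

(4.6)   $\dfrac{\partial f}{\partial x}(x_0,y_0) \approx \dfrac{f_{10} - f_{00}}{h}, \qquad \dfrac{\partial f}{\partial x}(x_0,y_0+k) \approx \dfrac{f_{11} - f_{01}}{h},$

(4.7)   $\dfrac{\partial^2 f}{\partial y\,\partial x} \approx \dfrac{\frac{\partial f}{\partial x}(x_0,y_0+k) - \frac{\partial f}{\partial x}(x_0,y_0)}{k} \approx \dfrac{f_{11} - f_{01} - f_{10} + f_{00}}{h\cdot k},$

and similarly,

(4.8)   $\dfrac{\partial^2 f}{\partial x\,\partial y} \approx \dfrac{\frac{\partial f}{\partial y}(x_0+h,y_0) - \frac{\partial f}{\partial y}(x_0,y_0)}{h} \approx \dfrac{f_{11} - f_{10} - f_{01} + f_{00}}{k\cdot h}.$

The expressions to the right of (4.7) and (4.8) are identical (Euler, “... huius
theorematis veritatem exercitati facile perspiciant ...”) and the statement of the
theorem seems plausible.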

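In order to make the proof rigorous, we should replace the differences in (4.6)
by Lagrange’s Theorem III.6.11. There is, however, a slight difficulty, because the
intermediate points $\xi$ will not be the same for the two differences. To overcome
this difficulty, we consider the function

(4.9)   $g(x) = f(x, y_0+k) - f(x, y_0),$

apply Lagrange’s Theorem in the form $g(x_0+h) - g(x_0) = h\,g'(\xi)$, and obtain

$f_{11} - f_{10} - f_{01} + f_{00} = h\Bigl(\dfrac{\partial f}{\partial x}(\xi, y_0+k) - \dfrac{\partial f}{\partial x}(\xi, y_0)\Bigr),$

where $\xi$ lies between $x_0$ and $x_0+h$. Next, we apply Lagrange’s Theorem to
$\dfrac{\partial f}{\partial x}(\xi, y)$, considered this time as a function of $y$, and obtain

(4.10)   $\dfrac{f_{11} - f_{10} - f_{01} + f_{00}}{h\cdot k} = \dfrac{\partial^2 f}{\partial y\,\partial x}(\xi,\eta)$

($\eta$ is between $y_0$ and $y_0+k$).

Because of the continuity of $\dfrac{\partial^2 f}{\partial y\,\partial x}$ at $(x_0,y_0)$, it follows from (4.10) that for
every $\varepsilon > 0$, there exists a $\delta > 0$ such that for $h^2 + k^2 < \delta^2$,

$\Bigl|\dfrac{f_{11} - f_{10} - f_{01} + f_{00}}{h\cdot k} - \dfrac{\partial^2 f}{\partial y\,\partial x}(x_0,y_0)\Bigr| < \varepsilon.$

For $k \to 0$ the differences $(f_{11} - f_{10})/k$ and $(f_{01} - f_{00})/k$ tend to $\dfrac{\partial f}{\partial y}(x_0+h,y_0)$
and $\dfrac{\partial f}{\partial y}(x_0,y_0)$, respectively. Hence, we have, for $|h| < \delta$,

$\Bigl|\dfrac{1}{h}\Bigl(\dfrac{\partial f}{\partial y}(x_0+h,y_0) - \dfrac{\partial f}{\partial y}(x_0,y_0)\Bigr) - \dfrac{\partial^2 f}{\partial y\,\partial x}(x_0,y_0)\Bigr| \le \varepsilon.$

This, however, means that

$\lim_{h\to 0}\dfrac{1}{h}\Bigl(\dfrac{\partial f}{\partial y}(x_0+h,y_0) - \dfrac{\partial f}{\partial y}(x_0,y_0)\Bigr) = \dfrac{\partial^2 f}{\partial y\,\partial x}(x_0,y_0)$

and the statement of the theorem is established.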
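The second difference quotient $(f_{11} - f_{10} - f_{01} + f_{00})/(h\cdot k)$ used in the proof is easy to try out. The sketch below assumes the smooth test function $f(x,y) = e^{-x^2-y^2}$ and the point $(0.9, 1.2)$; the function name `second_difference` and the chosen step sizes are illustrative only.

```python
import math

def f(x, y):
    """Smooth test function exp(-x^2 - y^2)."""
    return math.exp(-x * x - y * y)

def second_difference(func, x0, y0, h, k):
    """(f11 - f10 - f01 + f00) / (h*k) on the rectangle with sides h and k."""
    f00 = func(x0, y0)
    f10 = func(x0 + h, y0)
    f01 = func(x0, y0 + k)
    f11 = func(x0 + h, y0 + k)
    return (f11 - f10 - f01 + f00) / (h * k)

x0, y0 = 0.9, 1.2
exact = 4 * x0 * y0 * f(x0, y0)  # mixed derivative 4xy*exp(-x^2 - y^2)
quotients = [second_difference(f, x0, y0, s, s) for s in (1e-1, 1e-2, 1e-3)]

for q in quotients:
    print(q, abs(q - exact))
```

For this function the error shrinks roughly in proportion to the side length of the rectangle, in agreement with the continuity argument of the proof.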
This theorem applied several times allows us to exchange higher order derivatives.
For example,

$\dfrac{\partial^3 f}{\partial x\,\partial y\,\partial x} = \dfrac{\partial^3 f}{\partial x\,\partial x\,\partial y}, \qquad \dfrac{\partial^5 f}{\partial x\,\partial x\,\partial y\,\partial x\,\partial y} = \dfrac{\partial^5 f}{\partial x\,\partial x\,\partial x\,\partial y\,\partial y}.$

It also applies to functions of more than two variables. Indeed, we can always
exchange two partial derivatives at a time, the other variables being kept constant.


Taylor Series for Two Variables 


Our next aim is to extend the Taylor series to functions of two variables. The idea
(Cauchy 1829, p. 244) is to reduce the problem to one variable by connecting the
points $(x_0,y_0)$ and $(x_0+h, y_0+k)$ by a straight line. We thus consider the function

(4.11)   $g(t) := f(x_0+th,\ y_0+tk)$

and apply Eq. (III.7.18) (Taylor series for one variable). For this we have to compute
the derivatives of $g(t)$. If $f(x,y)$ is differentiable sufficiently often, the chain
rule yields

(4.12)   $g'(t) = \dfrac{\partial f}{\partial x}(x_0+th, y_0+tk)\,h + \dfrac{\partial f}{\partial y}(x_0+th, y_0+tk)\,k$

and a further differentiation gives

(4.13)   $g''(t) = \dfrac{\partial^2 f}{\partial x^2}(\ldots)\,hh + \dfrac{\partial^2 f}{\partial y\,\partial x}(\ldots)\,hk + \dfrac{\partial^2 f}{\partial x\,\partial y}(\ldots)\,kh + \dfrac{\partial^2 f}{\partial y^2}(\ldots)\,kk,$

where the omitted argument of the partial derivatives of $f$ is $(x_0+th, y_0+tk)$.
The two central terms in (4.13) are equal by Theorem 4.3 (further differentiation
causes the appearance of the binomial coefficients). Inserting the above derivatives
of $g(t)$ into, for example,

$g(1) = g(0) + g'(0) + \dfrac{1}{2!}\,g''(0) + \dfrac{1}{3!}\,g'''(\theta)$

(with $0 < \theta < 1$), yields

(4.14)   $f(x_0+h, y_0+k) = f(x_0,y_0) + \dfrac{\partial f}{\partial x}(x_0,y_0)\,h + \dfrac{\partial f}{\partial y}(x_0,y_0)\,k$
         $\quad + \dfrac{1}{2!}\Bigl(\dfrac{\partial^2 f}{\partial x^2}(x_0,y_0)\,h^2 + 2\,\dfrac{\partial^2 f}{\partial x\,\partial y}(x_0,y_0)\,hk + \dfrac{\partial^2 f}{\partial y^2}(x_0,y_0)\,k^2\Bigr)$
         $\quad + \dfrac{1}{3!}\Bigl(\dfrac{\partial^3 f}{\partial x^3}(\xi,\eta)\,h^3 + 3\,\dfrac{\partial^3 f}{\partial x^2\,\partial y}(\xi,\eta)\,h^2k + 3\,\dfrac{\partial^3 f}{\partial x\,\partial y^2}(\xi,\eta)\,hk^2 + \dfrac{\partial^3 f}{\partial y^3}(\xi,\eta)\,k^3\Bigr),$

where $\xi = x_0 + \theta h$ and $\eta = y_0 + \theta k$ are intermediate points. It is of course also
possible to use Theorem III.7.13 with the remainder in integral form.


(4.4) Example. We consider the function $f(x,y) = e^{-x^2-y^2}$ (see also Example
3.1), whose partial derivatives are

$\dfrac{\partial f}{\partial x}(x,y) = -2x\,e^{-x^2-y^2}, \qquad \dfrac{\partial f}{\partial y}(x,y) = -2y\,e^{-x^2-y^2},$

$\dfrac{\partial^2 f}{\partial x^2}(x,y) = (4x^2-2)\,e^{-x^2-y^2}, \qquad \dfrac{\partial^2 f}{\partial y^2}(x,y) = (4y^2-2)\,e^{-x^2-y^2},$

$\dfrac{\partial^2 f}{\partial x\,\partial y}(x,y) = \dfrac{\partial^2 f}{\partial y\,\partial x}(x,y) = 4xy\,e^{-x^2-y^2}.$

If we neglect the remainder in (4.14) and put $x_0 = 0.9$, $y_0 = 1.2$, we obtain the
quadratic approximation

$f(0.9+h,\ 1.2+k) \approx e^{-2.25}\,(1 - 1.8h - 2.4k + 0.62h^2 + 4.32hk + 1.88k^2).$

Fig. 4.1 compares this approximation to the function $f(x,y)$. The domain of the
graph is restricted to $-1 \le x \le 2$, $-1 \le y \le 2$.
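The coefficients of this quadratic approximation follow directly from the partial derivatives of $f(x,y) = e^{-x^2-y^2}$ listed above. A short sketch (the function names `quadratic_taylor_coeffs` and `approx` are illustrative, not from the text) recomputes them and compares the approximation with $f$ near $(0.9, 1.2)$:

```python
import math

def f(x, y):
    return math.exp(-x * x - y * y)

def quadratic_taylor_coeffs(x0, y0):
    """Coefficients of 1, h, k, h^2, hk, k^2 in (4.14), divided by f(x0, y0)."""
    return (
        1.0,
        -2 * x0,                  # (df/dx) / f
        -2 * y0,                  # (df/dy) / f
        (4 * x0 * x0 - 2) / 2,    # (1/2) (d2f/dx2) / f
        4 * x0 * y0,              # (1/2) * 2 * (d2f/dxdy) / f
        (4 * y0 * y0 - 2) / 2,    # (1/2) (d2f/dy2) / f
    )

def approx(h, k, x0=0.9, y0=1.2):
    c = quadratic_taylor_coeffs(x0, y0)
    poly = c[0] + c[1] * h + c[2] * k + c[3] * h * h + c[4] * h * k + c[5] * k * k
    return f(x0, y0) * poly

coeffs = quadratic_taylor_coeffs(0.9, 1.2)
print(coeffs)
print(approx(0.05, -0.05), f(0.95, 1.15))
```

Up to rounding, the coefficients agree with $1, -1.8, -2.4, 0.62, 4.32, 1.88$, and the common factor is $f(0.9, 1.2) = e^{-2.25}$.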


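FIGURE 4.1. Taylor’s approximation of second order for $f(x,y) = e^{-x^2-y^2}$


Taylor Series for n Variables


We now extend our formulas to functions

$f : \mathbb{R}^n \to \mathbb{R}^m,$

where $f(x) = (f_1(x),\ldots,f_m(x))$ is composed of $m$ real functions of $x \in \mathbb{R}^n$.
We fix $x_0 \in \mathbb{R}^n$, $h \in \mathbb{R}^n$ and apply the results of Sect. III.7 to $g(t) := f_i(x_0+th)$.
This yields, for example,

(4.15)   $f_i(x_0+h) = f_i(x_0) + \sum_{j=1}^n \dfrac{\partial f_i}{\partial x_j}(x_0)\,h_j + \dfrac{1}{2!}\sum_{j=1}^n\sum_{k=1}^n \dfrac{\partial^2 f_i}{\partial x_j\,\partial x_k}(x_0)\,h_jh_k$
         $\quad + \dfrac{1}{3!}\sum_{j=1}^n\sum_{k=1}^n\sum_{\ell=1}^n \dfrac{\partial^3 f_i}{\partial x_j\,\partial x_k\,\partial x_\ell}(x_0+\theta_i h)\,h_jh_kh_\ell.$

We can go even further, and write (formally, without considering convergence)

$f_i(x_0+h) = f_i(x_0) + \sum_{q\ge 1}\dfrac{1}{q!}\sum_{j_1=1}^n\cdots\sum_{j_q=1}^n \dfrac{\partial^q f_i}{\partial x_{j_1}\cdots\partial x_{j_q}}(x_0)\,h_{j_1}\cdots h_{j_q}.$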


These formulas are rather cumbersome and call for a more compact notation,
which, in the words of Dieudonné, “does away with hordes of indices”. The linear
term in (4.15) is just the $i$th element of the product $f'(x_0)h$ (Jacobian matrix
with vector $h$). In order to simplify the quadratic term, we consider the bilinear
mapping $f''(x) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^m$, whose $i$th component, when applied to a pair
of vectors $u$ and $v$, is defined by

(4.16)   $\bigl(f''(x)(u,v)\bigr)_i = \sum_{j=1}^n\sum_{k=1}^n \dfrac{\partial^2 f_i}{\partial x_j\,\partial x_k}(x)\,u_jv_k.$

Hence, the quadratic term in (4.15) is the $i$th element of the vector $f''(x_0)(h,h)$.
We can continue by interpreting higher derivatives as multilinear mappings. For
example, $f'''(x) : \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^m$ is defined by

(4.17)   $\bigl(f'''(x)(u,v,w)\bigr)_i = \sum_{j=1}^n\sum_{k=1}^n\sum_{\ell=1}^n \dfrac{\partial^3 f_i}{\partial x_j\,\partial x_k\,\partial x_\ell}(x)\,u_jv_kw_\ell.$

With this notation, formula (4.15) becomes

(4.18)   $f(x_0+h) = f(x_0) + f'(x_0)h + \dfrac{1}{2!}\,f''(x_0)(h,h) + R_3.$

For the remainder $R_3$ we may not write $R_3 = (1/3!)\,f'''(x_0+\theta h)(h,h,h)$, because
the intermediate points $x_0+\theta_i h$ in (4.15) might be different for each component.
However, we can use the integral representation (Theorem III.7.13) to obtain

(4.19)   $R_3 = \int_0^1 \dfrac{(1-t)^2}{2!}\,f'''(x_0+th)(h,h,h)\,dt.$
0 


(4.5) Remark. For a vector-valued function $g(t) = (g_1(t),\ldots,g_m(t))^T$ we use
the notation

(4.20)   $\int_0^1 g(t)\,dt := \Bigl(\int_0^1 g_1(t)\,dt,\ \ldots,\ \int_0^1 g_m(t)\,dt\Bigr)^T.$

In what follows, we shall use the estimate

(4.21)   $\Bigl\|\int_0^1 g(t)\,dt\Bigr\| \le \int_0^1 \|g(t)\|\,dt,$

which is obtained by considering Riemann sums and using the triangle inequality
as follows: $\bigl\|\sum_i g(\xi_i)\,\delta_i\bigr\| \le \sum_i \|g(\xi_i)\|\,\delta_i$.


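Estimation of the Remainder. Suppose we want to estimate the remainder $R_3$ of
(4.19). In view of (4.21), we have to estimate the expression $\|f'''(x)(h,h,h)\|$. For
the Euclidean norm this can be achieved by repeated application of the Cauchy-Schwarz
inequality. Denoting the expression of Eq. (4.17) by $a_i$, we have

$a_i = \sum_{j=1}^n b_{ij}\,u_j, \qquad a_i^2 \le \Bigl(\sum_{j=1}^n b_{ij}^2\Bigr)\|u\|^2,$

$b_{ij} = \sum_{k=1}^n c_{ijk}\,v_k, \qquad b_{ij}^2 \le \Bigl(\sum_{k=1}^n c_{ijk}^2\Bigr)\|v\|^2,$

$c_{ijk} = \sum_{\ell=1}^n d_{ijk\ell}\,w_\ell, \qquad c_{ijk}^2 \le \Bigl(\sum_{\ell=1}^n d_{ijk\ell}^2\Bigr)\|w\|^2,$

where $d_{ijk\ell} = \dfrac{\partial^3 f_i(x)}{\partial x_j\,\partial x_k\,\partial x_\ell}$. Inserting $c_{ijk}^2$ from the last inequality into the
preceding one, then $b_{ij}^2$ into the first inequality, yields

$a_i^2 \le \Bigl(\sum_{j=1}^n\sum_{k=1}^n\sum_{\ell=1}^n d_{ijk\ell}^2\Bigr)\|u\|^2\,\|v\|^2\,\|w\|^2.$

Computing $\sum_i a_i^2$ and its square root, we obtain

(4.22)   $\|f'''(x)(u,v,w)\| \le M(x)\,\|u\|\,\|v\|\,\|w\|,$

where

(4.23)   $M(x) = \sqrt{\sum_{i=1}^m\sum_{j=1}^n\sum_{k=1}^n\sum_{\ell=1}^n \Bigl(\dfrac{\partial^3 f_i}{\partial x_j\,\partial x_k\,\partial x_\ell}(x)\Bigr)^2}.$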
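The estimate (4.22) can be tried out on random data. The sketch below treats a single component ($m = 1$) and replaces the third derivatives by an arbitrary random array `d[j][k][l]`; this array, the dimension `n = 4`, and all function names are assumptions made only for illustration.

```python
import math
import random

def trilinear(d, u, v, w):
    """Sum over j, k, l of d[j][k][l] * u[j] * v[k] * w[l], as in (4.17) for one component."""
    n = len(u)
    return sum(d[j][k][l] * u[j] * v[k] * w[l]
               for j in range(n) for k in range(n) for l in range(n))

def norm(x):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(t * t for t in x))

def coefficient_norm(d):
    """Square root of the sum of squares of all entries, the analog of M(x) in (4.23)."""
    return math.sqrt(sum(t * t for plane in d for row in plane for t in row))

random.seed(0)
n = 4
d = [[[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)] for _ in range(n)]

checks = []
for _ in range(200):
    u, v, w = ([random.uniform(-1, 1) for _ in range(n)] for _ in range(3))
    lhs = abs(trilinear(d, u, v, w))
    rhs = coefficient_norm(d) * norm(u) * norm(v) * norm(w)
    checks.append(lhs <= rhs + 1e-12)

bound_holds = all(checks)
print(bound_holds)
```

The bound is usually far from sharp, because each of the three Cauchy-Schwarz steps may lose something.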


(4.6) Lemma. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be three times continuously differentiable; then
the remainder $R_3$ in Eq. (4.18) satisfies

$\|R_3\| \le \dfrac{\|h\|^3}{3!}\,\sup_{t\in[0,1]} M(x_0+th),$

where $M(x)$ is given by (4.23).

Proof. Applying the estimate (4.21) to (4.19) yields

$\|R_3\| \le \int_0^1 \dfrac{(1-t)^2}{2!}\,\bigl\|f'''(x_0+th)(h,h,h)\bigr\|\,dt.$

Because of (4.22), the expression $\|f'''(x_0+th)(h,h,h)\|$ is at most equal to
$\sup_{t\in[0,1]} M(x_0+th)\,\|h\|^3$ and the conclusion follows from Eq. (III.5.18).


Maximum and Minimum Problems 


Our next aim is to extend the results of Sect. II.2 concerning necessary and sufficient
conditions for a local maximum (or minimum) to functions $z = f(x,y)$
of two variables. We have already seen in Sect. IV.3 (geometrical interpretation of
the gradient) that $\operatorname{grad} f(x_0,y_0) = 0$, i.e.,

(4.24)   $\dfrac{\partial f}{\partial x}(x_0,y_0) = 0, \qquad \dfrac{\partial f}{\partial y}(x_0,y_0) = 0,$

is a necessary condition for a maximum (or minimum). Points satisfying (4.24)
are called stationary points of $f(x,y)$.

In a sufficiently small neighborhood of a stationary point $(x_0,y_0)$ (i.e., if
$|x - x_0|$ and $|y - y_0|$ are small), the remainder term in (4.14) may be neglected
and the condition

(4.25)   $\dfrac{\partial^2 f}{\partial x^2}(x_0,y_0)\,h^2 + 2\,\dfrac{\partial^2 f}{\partial x\,\partial y}(x_0,y_0)\,hk + \dfrac{\partial^2 f}{\partial y^2}(x_0,y_0)\,k^2 > 0$

guarantees that $f(x_0+h, y_0+k) > f(x_0,y_0)$ (if the function is only twice continuously
differentiable, we take one term fewer in the Taylor series and exploit the
continuity of the second partial derivatives). Therefore, we have a local minimum,
if (4.25) holds for all $(h,k) \ne (0,0)$. If the expression in Eq. (4.25) is negative
for all $(h,k) \ne (0,0)$, we have a local maximum. In the case where (4.25) takes
positive and negative values depending on the choice of $(h,k)$, the function has a
saddle point at $(x_0,y_0)$, i.e., there are directions in which the function increases
and other directions in which it decreases.

In order to check whether a quadratic form $Ah^2 + 2Bhk + Ck^2$ is positive
for all $(h,k) \ne (0,0)$, we put $\lambda = h/k$ and consider $A\lambda^2 + 2B\lambda + C$. This
polynomial takes only positive values if $A > 0$ and $AC - B^2 > 0$, and only
negative values if $A < 0$ and $AC - B^2 > 0$. We have thus proved the following
result, which is from the very first paper published by the young Lagrange.


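(4.7) Theorem (Lagrange 1759). Let $f : \mathbb{R}^2 \to \mathbb{R}$ be twice continuously differentiable
and suppose that (4.24) is satisfied.

a) The point $(x_0,y_0)$ is a local minimum, if, at $(x_0,y_0)$,

(4.26)   $\dfrac{\partial^2 f}{\partial x^2} > 0$   and   $\dfrac{\partial^2 f}{\partial x^2}\,\dfrac{\partial^2 f}{\partial y^2} - \Bigl(\dfrac{\partial^2 f}{\partial x\,\partial y}\Bigr)^2 > 0.$

b) The point $(x_0,y_0)$ is a local maximum, if, at $(x_0,y_0)$,

(4.27)   $\dfrac{\partial^2 f}{\partial x^2} < 0$   and   $\dfrac{\partial^2 f}{\partial x^2}\,\dfrac{\partial^2 f}{\partial y^2} - \Bigl(\dfrac{\partial^2 f}{\partial x\,\partial y}\Bigr)^2 > 0.$

c) In the case where

(4.28)   $\dfrac{\partial^2 f}{\partial x^2}\,\dfrac{\partial^2 f}{\partial y^2} - \Bigl(\dfrac{\partial^2 f}{\partial x\,\partial y}\Bigr)^2 < 0$

at $(x_0,y_0)$, this point is a saddle point.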


(4.8) Example. The function

(4.29)   $f(x,y) = x^3 + y^3 - 3xy$

creates the famous “folium cartesii” (letter of Descartes to Mersenne, Aug. 23,
1638). Its level curves are plotted in Fig. 4.2. Computing the partial derivatives

$\dfrac{\partial f}{\partial x}(x,y) = 3x^2 - 3y, \qquad \dfrac{\partial f}{\partial y}(x,y) = 3y^2 - 3x,$

we see that the function (4.29) has two stationary points, namely $(0,0)$ and $(1,1)$.
Checking the sufficient conditions of Theorem 4.7 shows that $(0,0)$ is a saddle
point and that $(1,1)$ is a local minimum (see also Fig. 4.2).
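The test of Theorem 4.7 is mechanical enough to be written down as a few lines of code. The following sketch applies it to $f(x,y) = x^3 + y^3 - 3xy$, whose second derivatives are $f_{xx} = 6x$, $f_{xy} = -3$, $f_{yy} = 6y$; the function names and the label "undecided" for the degenerate case are illustrative choices.

```python
def classify(fxx, fxy, fyy):
    """Classify a stationary point from the second derivatives (Theorem 4.7)."""
    det = fxx * fyy - fxy ** 2
    if det > 0 and fxx > 0:
        return "local minimum"
    if det > 0 and fxx < 0:
        return "local maximum"
    if det < 0:
        return "saddle point"
    return "undecided"  # det == 0: the theorem makes no statement

def folium_second_derivatives(x, y):
    """Second partial derivatives of f(x, y) = x^3 + y^3 - 3xy."""
    return 6.0 * x, -3.0, 6.0 * y

for point in [(0.0, 0.0), (1.0, 1.0)]:
    print(point, classify(*folium_second_derivatives(*point)))
```

At $(0,0)$ the determinant is $-9$, at $(1,1)$ it is $27$ with $f_{xx} = 6 > 0$.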


FIGURE 4.2. Level curves for the Cartesian Folium (4.29) 


Extension to n Variables. Consider real-valued functions $z = f(x_1,\ldots,x_n)$
with more than two variables. We have seen in Sect. IV.3 that a necessary condition
for a local extremum (maximum or minimum) at $x_0 \in \mathbb{R}^n$ is

(4.30)   $\operatorname{grad} f(x_0) = 0.$

To obtain sufficient conditions, we must study the quadratic term in (4.15). With
$h = (h_1,\ldots,h_n)^T$, this term can be written as $(h^T H(x_0)\,h)/2$, where

(4.31)   $H(x) = \begin{pmatrix} \dfrac{\partial^2 f}{\partial x_1^2}(x) & \cdots & \dfrac{\partial^2 f}{\partial x_1\,\partial x_n}(x) \\ \vdots & & \vdots \\ \dfrac{\partial^2 f}{\partial x_n\,\partial x_1}(x) & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}(x) \end{pmatrix}$

is the so-called Hessian matrix (Hesse 1857, Crelle J. f. Math., vol. 54, p. 251). If
the assumptions of Theorem 4.3 are satisfied, this matrix is symmetric.

If, in addition to (4.30), the matrix (4.31) is “positive definite” at $x_0$, i.e.,
$h^T H(x_0)\,h > 0$ for all $h \ne 0$, then the point $x_0$ is a local minimum. A stationary
point $x_0$ is a local maximum if $H(x_0)$ is “negative definite”, i.e., $h^T H(x_0)\,h < 0$
for all $h \ne 0$. For the verification of positive (negative) definiteness of a matrix of
dimension $\ge 3$ we refer to the standard literature on Linear Algebra, e.g., Halmos
(1958, p. 141, 153).
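One standard way of testing positive definiteness of a symmetric matrix is to attempt a Cholesky factorization $H = LL^T$; it succeeds with positive pivots exactly when $H$ is positive definite. This technique is not described in the text; the sketch below is a plain illustration, with an assumed tolerance `tol` and made-up example matrices.

```python
import math

def is_positive_definite(H, tol=1e-12):
    """Attempt a Cholesky factorization H = L L^T of a symmetric matrix H.

    Returns True if all pivots are larger than tol, False otherwise.
    """
    n = len(H)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = H[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False  # non-positive pivot: not positive definite
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True

def is_negative_definite(H, tol=1e-12):
    """H is negative definite exactly when -H is positive definite."""
    return is_positive_definite([[-v for v in row] for row in H], tol)

H1 = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
H2 = [[1.0, 2.0], [2.0, 1.0]]
print(is_positive_definite(H1), is_positive_definite(H2))
```

For a stationary point $x_0$, `is_positive_definite(H)` applied to the Hessian at $x_0$ then indicates a local minimum and `is_negative_definite(H)` a local maximum.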


Conditional Minimum (Lagrange Multiplier) 


Problem. Find a local maximum (or minimum) of a function $f(x,y)$ subject to a
constraint $g(x,y) = 0$. If we denote the level set of $g$ by $A = \{(x,y) \mid g(x,y) = 0\}$,
this means that we have to find $(x_0,y_0) \in A$ such that $f(x,y) \le f(x_0,y_0)$ for
$(x,y) \in A$.


A direct approach would be to solve the equation $g(x,y) = 0$ for $y$ in order
to obtain $y = G(x)$ (see the Implicit Function Theorem 3.8) and to look for
an extremum of $F(x) = f(x, G(x))$. More generally, we could try to find a
parameterization $(x(t), y(t))$ of the level curve $A$ and consider the function $F(t) = f(x(t), y(t))$.
A necessary condition for an extremum at $(x_0,y_0) = (x(t_0), y(t_0))$
is $F'(t_0) = 0$, i.e.,

(4.32)   $\dfrac{\partial f}{\partial x}(x_0,y_0)\,x'(t_0) + \dfrac{\partial f}{\partial y}(x_0,y_0)\,y'(t_0) = 0.$

This is an equation for $t_0$ and sorts out possible candidates for the solution. However,
this approach is often impracticable, because a suitable parameterization is
difficult to obtain.


Lagrange’s Idea (Lagrange 1788, première partie, Sect. IV, §1, Oeuvres, vol. 11,
p. 78). We observe from (4.32) that $\operatorname{grad} f(x_0,y_0)$ is orthogonal to the tangent
vector $(x'(t_0), y'(t_0))$ of the level curve $A$. Hence (see Sect. IV.3), at a local
extremum, the vectors $\operatorname{grad} f(x_0,y_0)$ and $\operatorname{grad} g(x_0,y_0)$ have the same direction
(see Fig. 4.3), and we get the necessary condition


(4.33)   $\operatorname{grad} f(x_0,y_0) = \lambda\,\operatorname{grad} g(x_0,y_0), \qquad g(x_0,y_0) = 0$
(if $\operatorname{grad} g(x_0,y_0) \ne 0$). The parameter $\lambda$ is called a Lagrange multiplier.
Equations (4.33) represent three conditions for the three parameters $x_0$, $y_0$, $\lambda$. With the
function

(4.34)   $\mathcal{L}(x,y,\lambda) = f(x,y) - \lambda\,g(x,y),$

condition (4.33) can be expressed elegantly as

(4.35)   $\operatorname{grad} \mathcal{L}(x_0,y_0,\lambda) = 0.$


FIGURE 4.3. Conditional maximum for $f(x,y) = x + 2y$, $p = 3$


(4.9) Example. Let positive numbers $a$, $b$ and $p > 1$ be given. Compute the maximum of

(4.36)   $f(x,y) = ax + by$

in the region $x > 0$, $y > 0$, subject to the constraint $g(x,y) = x^p + y^p - 1 = 0$
(see Fig. 4.3). Using Lagrange’s idea, we consider the function
$\mathcal{L}(x,y,\lambda) = ax + by - \lambda(x^p + y^p - 1)$, and the necessary condition (4.35) becomes

(4.37)   $a - p\lambda x^{p-1} = 0, \qquad b - p\lambda y^{p-1} = 0, \qquad x^p + y^p = 1.$

The first two relations yield

(4.38)   $x = \Bigl(\dfrac{a}{p\lambda}\Bigr)^{1/(p-1)}, \qquad y = \Bigl(\dfrac{b}{p\lambda}\Bigr)^{1/(p-1)},$

and by inserting these values into the last relation of (4.37), we obtain

(4.39)   $(a^q + b^q)\Bigl(\dfrac{1}{p\lambda}\Bigr)^q = 1,$

where

(4.40)   $q = \dfrac{p}{p-1}$   or   $\dfrac{1}{p} + \dfrac{1}{q} = 1.$

Equation (4.39) allows us to compute $\lambda$. Inserting the result into (4.38), we finally
obtain the solution

(4.41)   $x_0 = \dfrac{a^{q/p}}{(a^q + b^q)^{1/p}}, \qquad y_0 = \dfrac{b^{q/p}}{(a^q + b^q)^{1/p}},$

which by Fig. 4.3 can be seen to yield the desired maximum.
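The closed-form solution $x_0 = a^{q/p}/(a^q+b^q)^{1/p}$, $y_0 = b^{q/p}/(a^q+b^q)^{1/p}$ with $q = p/(p-1)$ can be compared with a direct search along the constraint curve $x^p + y^p = 1$. This is only a sketch: the values $a = 1$, $b = 2$, $p = 3$ are those of Fig. 4.3, while the function names and the grid size are assumptions.

```python
def constrained_max(a, b, p):
    """Solution (4.41) of: maximize a*x + b*y subject to x^p + y^p = 1, x, y > 0."""
    q = p / (p - 1)
    denom = (a ** q + b ** q) ** (1 / p)
    x0 = a ** (q / p) / denom
    y0 = b ** (q / p) / denom
    return x0, y0, a * x0 + b * y0

def brute_force_max(a, b, p, steps=20000):
    """Largest value of a*x + b*y on a grid of points of the curve x^p + y^p = 1."""
    best = 0.0
    for i in range(steps + 1):
        x = i / steps
        y = (1 - x ** p) ** (1 / p)
        best = max(best, a * x + b * y)
    return best

a, b, p = 1.0, 2.0, 3.0
x0, y0, value = constrained_max(a, b, p)
q = p / (p - 1)
print(x0, y0, value, brute_force_max(a, b, p))
print((a ** q + b ** q) ** (1 / q))  # the maximum value written as (a^q + b^q)^(1/q)
```

The maximal value equals $(a^q + b^q)^{1/q}$, which is the quantity that appears in Hölder's inequality.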


Hölder’s Inequality (Hölder 1889). Let $\xi$, $\eta$ and $p > 1$ be positive numbers. Then

$x = \dfrac{\xi}{(\xi^p + \eta^p)^{1/p}}, \qquad y = \dfrac{\eta}{(\xi^p + \eta^p)^{1/p}}$

satisfy $x^p + y^p = 1$, and it follows from Example 4.9 that

$\dfrac{a\xi + b\eta}{(\xi^p + \eta^p)^{1/p}} \le \dfrac{a^q + b^q}{(a^q + b^q)^{1/p}}.$

We thus obtain

$a\xi + b\eta \le (\xi^p + \eta^p)^{1/p}\,(a^q + b^q)^{1/q},$

where $p$ and $q$ are related by (4.40). By induction on $n$, this inequality can be
generalized to

(4.42)   $\sum_{i=1}^n x_iy_i \le \Bigl(\sum_{i=1}^n x_i^p\Bigr)^{1/p}\Bigl(\sum_{i=1}^n y_i^q\Bigr)^{1/q}$

for positive numbers $x_i$ and $y_i$. This is the so-called Hölder inequality. For
$p = q = 2$, it reduces to the Cauchy-Schwarz inequality (1.5).

With (4.42), we can prove the triangle inequality for the norm $\|x\|_p$ of
Eq. (1.9). Indeed, for two vectors $x, y \in \mathbb{R}^n$, we have

$\|x+y\|_p^p = \sum_{i=1}^n |x_i + y_i|^p \le \sum_{i=1}^n |x_i|\cdot|x_i+y_i|^{p-1} + \sum_{i=1}^n |y_i|\cdot|x_i+y_i|^{p-1}.$

We apply (4.42) to the two sums on the right side of this inequality and obtain

$\sum_{i=1}^n |x_i|\cdot|x_i+y_i|^{p-1} \le \Bigl(\sum_{i=1}^n |x_i|^p\Bigr)^{1/p}\Bigl(\sum_{i=1}^n |x_i+y_i|^{(p-1)q}\Bigr)^{1/q} = \|x\|_p\cdot\|x+y\|_p^{p-1}$

(recall that $(p-1)q = p$ and $p/q = p-1$). This yields
$\|x+y\|_p^p \le (\|x\|_p + \|y\|_p)\cdot\|x+y\|_p^{p-1}$, and hence the triangle
inequality $\|x+y\|_p \le \|x\|_p + \|y\|_p$.
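Both the Hölder inequality $\sum x_iy_i \le (\sum x_i^p)^{1/p}(\sum y_i^q)^{1/q}$ (with $1/p + 1/q = 1$) and the resulting triangle inequality for the $p$-norm are easy to test on random vectors. The sketch below uses $p = 3$, vectors of length 6, and a small rounding tolerance; these values and the helper name `p_norm` are arbitrary choices for illustration.

```python
import random

def p_norm(x, p):
    """(sum |x_i|^p)^(1/p)."""
    return sum(abs(t) ** p for t in x) ** (1 / p)

random.seed(1)
p = 3.0
q = p / (p - 1)  # conjugate exponent, 1/p + 1/q = 1

holder_ok = True
triangle_ok = True
for _ in range(500):
    # Hoelder inequality for positive numbers x_i, y_i
    x = [random.uniform(0.0, 1.0) for _ in range(6)]
    y = [random.uniform(0.0, 1.0) for _ in range(6)]
    lhs = sum(s * t for s, t in zip(x, y))
    holder_ok = holder_ok and lhs <= p_norm(x, p) * p_norm(y, q) + 1e-12

    # triangle inequality for vectors with entries of arbitrary sign
    u = [random.uniform(-1.0, 1.0) for _ in range(6)]
    v = [random.uniform(-1.0, 1.0) for _ in range(6)]
    s = [ui + vi for ui, vi in zip(u, v)]
    triangle_ok = triangle_ok and p_norm(s, p) <= p_norm(u, p) + p_norm(v, p) + 1e-12

print(holder_ok, triangle_ok)
```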
Exercises


4.1 (H.A. Schwarz 1873). Show that for

$f(x,y) = \begin{cases} x^2\arctan\dfrac{y}{x} - y^2\arctan\dfrac{x}{y} & \text{if } xy \ne 0 \\ 0 & \text{if } xy = 0, \end{cases}$

the second partial derivatives at the origin are different:
$\dfrac{\partial^2 f}{\partial x\,\partial y} \ne \dfrac{\partial^2 f}{\partial y\,\partial x}$.


4.2 Show that Taylor’s formula (4.14) only holds if all partial derivatives involved
are continuous. This is in contrast to the case of one variable (see, e.g.,
Theorem III.6.11). The following counterexample by Peano (1884, “Annotazione
N. 109”),

$f(x,y) = \begin{cases} \dfrac{xy}{\sqrt{x^2+y^2}} & \text{if } x^2 + y^2 \ne 0 \\ 0 & \text{otherwise,} \end{cases}$

$x_0 = y_0 = -a$, $h = k = a + b$, shows that Eq. (4.14), written with the
first-order error term

$f(x_0+h, y_0+k) = f(x_0,y_0) + \dfrac{\partial f}{\partial x}(\xi,\eta)\,h + \dfrac{\partial f}{\partial y}(\xi,\eta)\,k,$

where $\xi = x_0 + \theta h$ and $\eta = y_0 + \theta k$ are intermediate points, might be wrong.
This corrected an error in Serret’s book.


4.3 Analyze for Example 4.4 the intersections of the graph of $f(x,y)$ with that
of its Taylor approximation of order 2 in the neighborhood of $(x_0,y_0)$ and
explain the star-shaped curves (see Fig. 4.1). Why do you think the authors
chose the point $(0.9, 1.2)$ for their figure and not, as in Fig. 3.1, the point
$(0.8, 1.0)$?

Hint. Use the error formula in (4.14).


4.4 Let $f : \mathbb{R}^2 \to \mathbb{R}$ be a differentiable function that satisfies

$\operatorname{grad} f(x) = g(x)\cdot x^T,$

where $g : \mathbb{R}^2 \to \mathbb{R}$. Show that $f$ is constant on the circle $\{x \in \mathbb{R}^2;\ \|x\| = r\}$.


4.5 Show that $U = (x^2 + y^2 + z^2)^{-1/2}$ satisfies the differential equation of
Laplace

$\dfrac{\partial^2 U}{\partial x^2} + \dfrac{\partial^2 U}{\partial y^2} + \dfrac{\partial^2 U}{\partial z^2} = 0$   for $x^2 + y^2 + z^2 > 0$.


4.6 Find the stationary points of the function

$f(x,y) = (x^2 + y^2)^2 - 8xy$

and study the level curves $f(x,y) = \mathrm{Const}$ in their neighborhood. (Any
similarity of these curves with curves already seen is intentional.)


4.7 


4.8 


4.9 
4.7 Find the maximum value of $\sqrt[3]{xyz}$ subject to $(x + y + z)/3 = 1$. What
conclusion can be drawn from this result? (We have already seen in Example
4.9 that the computation of a conditional maximum is an excellent tool for
obtaining interesting inequalities.)

4.8 Find the maxima or minima of $x^2 + y^2 + z^2$ subject to the conditions
—+—+4—=1 d = : 
a1 9 1 35 Ree ees ramet 


Remark. If there are two conditions to satisfy, you will have to introduce two 
Lagrange multipliers. 


Let 
0 v2 


be the matrix of the example in Sect. IV.2. Find the maximum of the function
$f(x) = \|Ax\|_2^2$ subject to $\|x\|_2^2 - 1 = 0$. The result is the value of $\|A\|_2$,
defined in Eq. (2.14).


ree 1 ) 


4.10 Show that the function $f : \mathbb{R}^2 \to \mathbb{R}$ given by

$f(x,y) = (y - x^2)(y - 2x^2)$


has the origin as a stationary point, but not as a local minimum. Neverthe- 
less, on all straight lines through the origin, the function has a local min- 
imum. With this counterexample, Peano (1884, “Annotazioni N. 133-136”) 
corrected another error in Serret’s book. Such irreverent criticism of the work 
of the greatest French mathematicians by a 25-year-old Italian “nobody” did 
not delight everybody (see, e.g., Peano’s Opere, p. 40-46). 
IV.5 Multiple Integrals


We know that the evaluation or even only the reduction of multiple integrals
generally presents very considerable difficulties . . .
(Dirichlet 1839, Werke, vol. I, p. 377)

The Riemann integral for a function of one variable (Sect. III.5) represents the
area between the function and the x-axis. We shall extend this concept to functions
$f : A \to \mathbb{R}$ (where $A \subset \mathbb{R}^2$) of two variables in such a way that the integral
represents the volume between the surface $z = f(x,y)$ and the $(x,y)$-plane. Many
definitions and results of Sect. III.5 can be extended straightforwardly. However,
additional technical difficulties occur, because domains in $\mathbb{R}^2$ are often more
complicated than those in $\mathbb{R}$ (see Fig. 5.1). The extension to functions of more than two
variables is then more or less straightforward.


FIGURE 5.1. Possible domains in $\mathbb{R}^2$ (rectangle, nonconvex, nonconnected)

Double Integrals over a Rectangle 


We begin by considering functions $f : I \to \mathbb{R}$, whose domain $I = [a,b] \times [c,d] =
\{(x,y) \mid a \le x \le b,\ c \le y \le d\}$ is a closed and bounded rectangle in $\mathbb{R}^2$, and we
assume that the function is bounded, i.e., that

(5.1)   $\exists\, M \ge 0 \quad \forall\,(x,y) \in I \qquad |f(x,y)| \le M.$

We consider divisions

(5.2)   $D_x = \{x_0, x_1, \ldots, x_n\}$ of $[a,b]$,
        $D_y = \{y_0, y_1, \ldots, y_m\}$ of $[c,d]$,

where $a = x_0 < x_1 < \ldots < x_n = b$ and $c = y_0 < y_1 < \ldots < y_m = d$, denote
the small rectangle displayed in Fig. 5.2 by $I_{ij} = [x_{i-1}, x_i] \times [y_{j-1}, y_j]$, and its
area by

(5.3)   $\mu(I_{ij}) = (x_i - x_{i-1})(y_j - y_{j-1}).$


FIGURE 5.2. Division of a rectangle together with $I_{ij}$


Using the notation

(5.4)   $f_{ij} = \inf_{(x,y)\in I_{ij}} f(x,y), \qquad F_{ij} = \sup_{(x,y)\in I_{ij}} f(x,y),$

we then define lower and upper sums by

(5.5)   $s(D_x \times D_y) = \sum_{i=1}^n\sum_{j=1}^m f_{ij}\,\mu(I_{ij}), \qquad S(D_x \times D_y) = \sum_{i=1}^n\sum_{j=1}^m F_{ij}\,\mu(I_{ij}).$

If we add points to the division $D_x$ (or to $D_y$), then the lower sum does not
decrease and the upper sum does not increase (cf. Lemma III.5.1). Furthermore, a
lower sum can never be larger than an upper sum (Lemma III.5.2). Hence, the
following definition makes sense.


(5.1) Definition. Let $f : I \to \mathbb{R}$ satisfy (5.1). If

(5.6)   $\sup_{(D_x,D_y)} s(D_x \times D_y) = \inf_{(D_x,D_y)} S(D_x \times D_y),$

then $f(x,y)$ is integrable on $I$ and the value (5.6) is denoted by

(5.7)   $\int_I f(x,y)\,d(x,y)$   or   $\iint_I f(x,y)\,d(x,y).$

As a consequence of this definition and of the aforementioned properties, we
have that $f : I \to \mathbb{R}$ is integrable, if and only if (see Theorem III.5.4)

(5.8)   $\forall\,\varepsilon > 0 \quad \exists\,(D_x,D_y) \qquad S(D_x \times D_y) - s(D_x \times D_y) < \varepsilon.$

The theorem of Du Bois-Reymond (Theorem III.5.8) also has its analog.
(5.2) Theorem. Let $\mathcal{D}_\delta$ be the set of all pairs of divisions $(D_x, D_y)$ such that
$\max_i (x_i - x_{i-1}) \le \delta$ and $\max_j (y_j - y_{j-1}) \le \delta$. A function $f : I \to \mathbb{R}$ satisfying
(5.1) is integrable if and only if

$\forall\,\varepsilon > 0 \quad \exists\,\delta > 0 \quad \forall\,(D_x,D_y) \in \mathcal{D}_\delta \qquad S(D_x \times D_y) - s(D_x \times D_y) < \varepsilon.$

Proof. For an $\varepsilon > 0$ let $(\tilde D_x, \tilde D_y)$ be given by (5.8). This induces a grid whose
length (in the interior of $[a,b] \times [c,d]$) is $L = (n-1)(d-c) + (m-1)(b-a)$
(see Fig. 5.3, left picture). We then take an arbitrary division $(D_x, D_y) \in \mathcal{D}_\delta$, set
$\Delta = S(D_x \times D_y) - s(D_x \times D_y)$, and put $D'_x = D_x \cup \tilde D_x$, $D'_y = D_y \cup \tilde D_y$,
$\Delta' = S(D'_x \times D'_y) - s(D'_x \times D'_y)$. We then get, exactly as in Eq. (III.5.10) (see
Fig. 5.3, right picture),

$\Delta \le \Delta' + L\cdot\delta\cdot 2M.$

The conclusion is now the same as in the proof of Theorem III.5.8.


FIGURE 5.3. Division $\tilde D_x \times \tilde D_y$ (left), division $D_x \times D_y$, elements $I_{ij}$ of $D'_x \times D'_y$ that
intersect $\tilde D_x \times \tilde D_y$ (right)


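Let $\xi_1,\ldots,\xi_n$ be such that $x_{i-1} \le \xi_i \le x_i$ and $\eta_1,\ldots,\eta_m$ be such that
$y_{j-1} \le \eta_j \le y_j$. It then follows from Theorem 5.2 that

(5.9)   $\Bigl|\sum_{i=1}^n\sum_{j=1}^m f(\xi_i,\eta_j)(x_i - x_{i-1})(y_j - y_{j-1}) - \iint_I f(x,y)\,d(x,y)\Bigr| < \varepsilon,$

provided that $\max_i (x_i - x_{i-1}) \le \delta$ and $\max_j (y_j - y_{j-1}) \le \delta$. This is true because
the sum and the integral in (5.9) both lie between $s(D_x \times D_y)$ and $S(D_x \times D_y)$.


Iterated Integrals. The inner sum in Eq. (5.9), namely $\sum_{j=1}^m f(\xi_i,\eta_j)(y_j - y_{j-1})$,
is a Riemann sum for the function $y \mapsto f(\xi_i, y)$. Assuming this function to be
integrable (in the sense of Definition III.5.3) for all $i$, we obtain from (5.9) that

(5.10)   $\Bigl|\sum_{i=1}^n \Bigl(\int_c^d f(\xi_i,y)\,dy\Bigr)(x_i - x_{i-1}) - \iint_I f(x,y)\,d(x,y)\Bigr| \le \varepsilon.$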
Here, we are again confronted with a Riemann sum, this time for the function
$x \mapsto \int_c^d f(x,y)\,dy$. The estimate (5.10) expresses the fact that the Riemann sums
converge to $\iint_I f(x,y)\,d(x,y)$ if $\max_i (x_i - x_{i-1}) \to 0$. Hence, we have (Exercise 5.1)

(5.11)   $\int_a^b \Bigl(\int_c^d f(x,y)\,dy\Bigr)dx = \iint_I f(x,y)\,d(x,y)$

and have proved the following result.


(5.3) Theorem (Stolz 1886, p. 93). Let $f : I \to \mathbb{R}$ be integrable and assume that
for each $x \in [a,b]$ the function $y \mapsto f(x,y)$ is integrable on $[c,d]$. Then, the
function $x \mapsto \int_c^d f(x,y)\,dy$ is integrable on $[a,b]$ and identity (5.11) holds.


Consequently, the computation of a double integral is reduced to the computation
of two simple (iterated) integrals and the techniques developed in Sects. II.4,
II.5, and III.5 can be applied. By symmetry, we also have

(5.12)   $\int_c^d \Bigl(\int_a^b f(x,y)\,dx\Bigr)dy = \iint_I f(x,y)\,d(x,y),$

provided that $f : I \to \mathbb{R}$ is integrable and that the function $x \mapsto f(x,y)$ is
integrable on $[a,b]$ for each $y \in [c,d]$. The two identities (5.11) and (5.12) together
show that the iterated integrals are independent of the order of integration (under
the stated assumptions).
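The relation between the double Riemann sum (5.9) and the two iterated integrals can be illustrated numerically. The sketch below assumes the test function $f(x,y) = x\,y^2$ on $I = [0,1] \times [0,2]$ (whose integral is $4/3$), equidistant divisions, and midpoints as intermediate points; all names are illustrative.

```python
def double_riemann_sum(f, a, b, c, d, n, m):
    """Double sum as in (5.9) with equidistant divisions and midpoints xi_i, eta_j."""
    hx = (b - a) / n
    hy = (d - c) / m
    total = sum(f(a + (i + 0.5) * hx, c + (j + 0.5) * hy)
                for i in range(n) for j in range(m))
    return total * hx * hy

def riemann_1d(g, lo, hi, n):
    """Midpoint Riemann sum of a function of one variable."""
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) for i in range(n)) * h

def iterated_dy_dx(f, a, b, c, d, n, m):
    """Approximates the left side of (5.11): inner integral in y, outer in x."""
    return riemann_1d(lambda x: riemann_1d(lambda y: f(x, y), c, d, m), a, b, n)

def iterated_dx_dy(f, a, b, c, d, n, m):
    """Approximates the left side of (5.12): inner integral in x, outer in y."""
    return riemann_1d(lambda y: riemann_1d(lambda x: f(x, y), a, b, n), c, d, m)

def f(x, y):
    return x * y * y

values = (
    double_riemann_sum(f, 0.0, 1.0, 0.0, 2.0, 100, 100),
    iterated_dy_dx(f, 0.0, 1.0, 0.0, 2.0, 100, 100),
    iterated_dx_dy(f, 0.0, 1.0, 0.0, 2.0, 100, 100),
)
print(values)
```

For a continuous integrand like this one all three numbers approach the same limit; the counterexamples in the text show that this need not be so without the integrability assumptions.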


Counterexamples. We shall show that the existence of one of the integrals in 
(5.11) does not necessarily imply the existence of the other. 


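FIGURE 5.4a. Nonintegrable function        FIGURE 5.4b. Integrable function


1) Let $f : [0,1] \times [0,1] \to \mathbb{R}$ be defined by (Fig. 5.4a)

(5.13)   $f(x,y) = \begin{cases} 1 & \text{if } (x,y) = \Bigl(\dfrac{2k-1}{2^n}, \dfrac{2\ell-1}{2^n}\Bigr) \text{ with integers } n, k, \ell, \\ 0 & \text{else.} \end{cases}$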
For a fixed $x \in [0,1]$ there are only a finite number of points with $f(x,y) \ne 0$.
Hence, $\int_0^1 f(x,y)\,dy = 0$ and the iterated integral to the left of (5.11) exists.
However, every rectangle $[x_{i-1}, x_i] \times [y_{j-1}, y_j]$ contains points with $f(x,y) = 1$
and points with $f(x,y) = 0$. Consequently, $s(D_x \times D_y) = 0$ and $S(D_x \times D_y) = 1$
for all divisions and the integral to the right of (5.11) does not exist.


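2) The function (Fig. 5.4b)

(5.14)   $f(x,y) = \begin{cases} 1 & \text{if } (x = 0 \text{ or } x = 1) \text{ and } y \in \mathbb{Q} \\ 1 & \text{if } (y = 0 \text{ or } y = 1) \text{ and } x \in \mathbb{Q} \\ 0 & \text{else} \end{cases}$

is integrable, because the points with $f(x,y) \ne 0$ form a set that can be neglected
(see below). But, for $x = 0$ or $x = 1$, the function $y \mapsto f(x,y)$ is the Dirichlet
function of Example III.5.6, which is not integrable.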


Null Sets and Discontinuous Functions 


Continuous functions $f : I \to \mathbb{R}$ are uniformly continuous ($I$ is compact, Theorem 2.5)
and hence integrable. The proof of this fact is the same as for Theorem
III.5.10. In the sequel, we shall prove the integrability of functions whose set of
discontinuities is not too large.


(5.4) Definition. A set $X \subset I \subset \mathbb{R}^2$ is said to be a null set if for every $\varepsilon > 0$ there
exist finitely many rectangles $I_k = [a_k, b_k] \times [c_k, d_k]$ $(k = 1,\ldots,n)$ such that

(5.15)   $X \subset \bigcup_{k=1}^n I_k$   and   $\sum_{k=1}^n \mu(I_k) < \varepsilon.$

Typical null sets are the boundaries of “regular” sets, e.g., triangles, disks,
polygons (see the example of Fig. 5.5¹). This is a consequence of the following
result.


FIGURE 5.5. A null set (three coverings by rectangles with total areas $\sum\mu = 0.880$, $0.475$, and $0.263$)


¹ A null set only in the strict mathematical sense, of course!
(5.5) Lemma. Let $\varphi : [0,1] \to \mathbb{R}^2$ represent a curve in the plane and suppose that

(5.16)   $\|\varphi(s) - \varphi(t)\|_\infty \le M\cdot|s-t|$   for all $s,t \in [0,1]$.

Then, the image set $\varphi([0,1])$ is a null set.


Proof. We divide $[0,1]$ into $n$ equidistant intervals $J_1, J_2, \ldots, J_n$ of length $1/n$.
For $s,t \in J_k$ we have $\|\varphi(s) - \varphi(t)\|_\infty \le M/n$, i.e., $\varphi(J_k)$ is contained in a
square $I_k$ of side $\le 2M/n$. Therefore, the entire curve is contained in a union of
$n$ squares $I_1,\ldots,I_n$, whose area is bounded by

$\sum_{k=1}^n \mu(I_k) \le \sum_{k=1}^n \Bigl(\dfrac{2M}{n}\Bigr)^2 = \dfrac{4M^2}{n} < \varepsilon$

if $n$ is sufficiently large. This proves (5.15).
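The covering used in this proof can be made explicit for a concrete curve. The sketch below assumes the unit circle $\varphi(t) = (\cos 2\pi t, \sin 2\pi t)$, for which $M = 2\pi$ is a valid constant in (5.16), and places each square around the image of the midpoint of $J_k$; these choices and all names are illustrative.

```python
import math

M = 2 * math.pi  # Lipschitz constant of phi with respect to the maximum norm

def phi(t):
    """Unit circle, parameterized over [0, 1]."""
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))

def covering(n):
    """Squares (center, half side M/n), one for each interval J_k = [k/n, (k+1)/n]."""
    return [(phi((k + 0.5) / n), M / n) for k in range(n)]

def total_area(n):
    """Sum of the areas of the n squares of side 2M/n, i.e. 4*M^2/n."""
    return n * (2 * M / n) ** 2

def covers(n, samples_per_interval=20):
    """Check on sample points that phi(J_k) lies in the k-th square."""
    for k, (center, half) in enumerate(covering(n)):
        for i in range(samples_per_interval + 1):
            x, y = phi((k + i / samples_per_interval) / n)
            if max(abs(x - center[0]), abs(y - center[1])) > half + 1e-12:
                return False
    return True

for n in (10, 100, 1000, 10000):
    print(n, total_area(n), covers(n, 5))
```

The total area $4M^2/n$ tends to zero, so for every $\varepsilon > 0$ a suitable $n$ can be found.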


Condition (5.16) is sufficient, but not necessary, for a curve to be a null set.
For example, von Koch’s curve (von Koch 1906) of Fig. 5.6 is a null set (see
Exercise 5.5) that has infinite length (hence, (5.16) cannot be satisfied). The curve of
Peano-Hilbert (Fig. 2.3) is not a null set, of course. However, Sierpiński’s triangle
and carpet (Fig. 1.9 and Fig. 1.10) are other interesting examples of null sets.


FIGURE 5.6. A null set, the curve of von Koch 
(5.6) Theorem. Let $f : I \to \mathbb{R}$ be a bounded function (satisfying (5.1)) and define

$X = \{(x,y) \in I;\ f \text{ is not continuous at } (x,y)\}.$

If $X$ is a null set, then the function $f(x,y)$ is integrable.


Proof. Let $\varepsilon > 0$ be given and let $\bigcup_{k=1}^n I_k$ be a finite covering of $X$ satisfying
(5.15). We enlarge the $I_k$ slightly and consider open rectangles $J_1,\ldots,J_n$ such
that $J_k \supset I_k$ for all $k$ and $\sum_{k=1}^n \mu(J_k) < 2\varepsilon$. The set $H := I \setminus \bigcup_{k=1}^n J_k$ is
then closed (Theorems 1.15 and 1.14) and therefore compact (Theorem 1.19).
Restricted to $H$, the function $f(x,y)$ is uniformly continuous (Theorem 2.5), which
means that there exists a $\delta > 0$ such that $|f(x,y) - f(\xi,\eta)| < \varepsilon$ whenever
$|x - \xi| < \delta$ and $|y - \eta| < \delta$.

We now start from a grid $D_x \times D_y$ containing all the vertices of the rectangles
$J_1,\ldots,J_n$ and refine it until the distances $x_i - x_{i-1}$ and $y_j - y_{j-1}$ are smaller
than $\delta$. We then split the difference $S(D_x \times D_y) - s(D_x \times D_y)$ according to

$\sum_{I_{ij} \subset H} (F_{ij} - f_{ij})\,\mu(I_{ij}) + \sum_{I_{ij} \not\subset H} (F_{ij} - f_{ij})\,\mu(I_{ij}).$

The sum on the left is $\le \varepsilon\,\mu(I)$ because of the uniform continuity of $f(x,y)$ on $H$;
the sum on the right is $\le 4M\varepsilon$ because the union of the rectangles $I_{ij}$ (which do
not lie in $H$) is contained in $\bigcup_{k=1}^n J_k$ with an area smaller than $2\varepsilon$. Both estimates
together show that $S(D_x \times D_y) - s(D_x \times D_y)$ can be made arbitrarily small.


Arbitrary Bounded Domains 


Dirichlet was particularly proud of his method of the discontinuous factor
for multiple integrals. He used to say that it’s a very simple idea, and added
with a smile, but one must have it.
(H. Minkowski, Jahrber. DMV, 14 (1905), p. 161)

Let $A \subset \mathbb{R}^2$ be a bounded domain contained in a rectangle $I$ (i.e., $A \subset I$) and let
$f : A \to \mathbb{R}$ be a bounded function. We want to find the volume under the surface
$z = f(x,y)$, with $(x,y)$ restricted to $A$.
The idea (Dirichlet 1839) is to consider the function $F : I \to \mathbb{R}$ defined by

(5.17)   $F(x,y) = \begin{cases} f(x,y) & \text{if } (x,y) \in A \\ 0 & \text{else.} \end{cases}$

If $F$ is integrable in the sense of Definition 5.1, then we define

(5.18)   $\iint_A f(x,y)\,d(x,y) = \iint_I F(x,y)\,d(x,y).$

A common situation is where $f : A \to \mathbb{R}$ is continuous on $A$ and where the
boundary of $A$, i.e.,

(5.19)   $\partial A := \bigl\{(x,y) \in \mathbb{R}^2 \;\bigm|\; \text{each neighborhood of } (x,y) \text{ contains elements of } A \text{ and of } \complement A\bigr\},$

is a null set. In this case, the discontinuities of $F$ all lie in $\partial A$ and Theorem 5.6
implies the integrability of $F$.

Iterated Integrals. The set $A$ can often be described in one of the following ways:

(5.20)   $A = \{(x,y) \mid a \le x \le b,\ \varphi_1(x) \le y \le \varphi_2(x)\},$

(5.21)   $A = \{(x,y) \mid c \le y \le d,\ \psi_1(y) \le x \le \psi_2(y)\},$

where $\varphi_i(x)$ and $\psi_i(y)$ are known functions (see Fig. 5.7). In this case, the
formulas (5.11), (5.12), together with (5.18), yield

(5.22)   $\iint_A f(x,y)\,d(x,y) = \int_a^b \Bigl(\int_{\varphi_1(x)}^{\varphi_2(x)} f(x,y)\,dy\Bigr)dx,$

(5.23)   $\iint_A f(x,y)\,d(x,y) = \int_c^d \Bigl(\int_{\psi_1(y)}^{\psi_2(y)} f(x,y)\,dx\Bigr)dy.$


FIGURE 5.7. Domains of $\mathbb{R}^2$ (Type (5.20), Type (5.21), Not type (5.20))


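Examples. 1) For the set $A = \{ (x,y) \mid -a \le x \le a,\ x^2 \le y \le a^2 \}$ we want to
compute the center of gravity

\[ \bar{y} = \frac{\iint_A y\, d(x,y)}{\iint_A 1\, d(x,y)} = \frac{3a^2}{5}. \]

We have the choice between (5.22) and (5.23):

\[ \iint_A y\, d(x,y) \overset{(5.22)}{=} \int_{-a}^{a} \Bigl( \underbrace{\int_{x^2}^{a^2} y\, dy}_{a^4/2 - x^4/2} \Bigr) dx = \frac{4a^5}{5}, \]

\[ \iint_A 1\, d(x,y) \overset{(5.23)}{=} \int_0^{a^2} \Bigl( \underbrace{\int_{-\sqrt{y}}^{\sqrt{y}} 1\, dx}_{2\sqrt{y}} \Bigr) dy = \frac{4a^3}{3}. \]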
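The center of gravity of Example 1 can be cross-checked with a small numerical version of formula (5.22). This is a sketch under stated assumptions: the composite midpoint rule and the helper names `midpoint` and `iterated_integral` are illustrative choices, and $a = 2$ is an arbitrary test value.

```python
def midpoint(g, lo, hi, n=200):
    """Composite midpoint rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def iterated_integral(f, a, b, phi1, phi2, n=200):
    """Right-hand side of (5.22): inner integral in y, outer integral in x."""
    return midpoint(lambda x: midpoint(lambda y: f(x, y), phi1(x), phi2(x), n), a, b, n)

a = 2.0
lower = lambda x: x * x      # phi_1(x) = x^2
upper = lambda x: a * a      # phi_2(x) = a^2

moment = iterated_integral(lambda x, y: y, -a, a, lower, upper)    # should approach 4 a^5 / 5
area = iterated_integral(lambda x, y: 1.0, -a, a, lower, upper)    # should approach 4 a^3 / 3
y_bar = moment / area
print(y_bar, 3 * a * a / 5)
```

For $a = 2$ the quotient should be close to $3a^2/5 = 2.4$.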
2) Compute the moment of inertia of a disc $A = \{ (x,y) \mid x^2 + y^2 \le a^2 \}$ rotated
around one of its diameters:

(5.24)  \[ I = \iint_A y^2\, d(x,y) = \int_{-a}^{a} \int_{-\sqrt{a^2 - x^2}}^{\sqrt{a^2 - x^2}} y^2\, dy\, dx. \]

The value of the inner integral is $\frac{2}{3}(a^2 - x^2)^{3/2}$, and for the outer integral we use
the substitution $x = a \sin t$, $dx = a \cos t\, dt$, $\sqrt{a^2 - x^2} = a \cos t$. This gives

\[ I = \int_{-\pi/2}^{\pi/2} \frac{2}{3}\, a^4 \cos^4 t\, dt = \frac{a^4 \pi}{4}. \]


The following fundamental theorem on coordinate changes will considerably sim- 
plify the computation of integrals such as (5.24) (see Example 5.8 below). 


The Transformation Formula for Double Integrals 


... this works for any other formula $\iint Z\, dx\, dy$, since it can be transformed
into $\iint Z\,(VR - ST)\, dt\, du$ by the same substitutions ...
(Euler 1769b)


Integration by substitution (Eq. (II.4.14)),

\[ \int_{g(a)}^{g(b)} f(x)\, dx = \int_a^b f(g(u))\, g'(u)\, du, \]

is an important tool for computing integrals. If $g : [a,b] \to [c,d]$ is bijective (and
continuously differentiable), this formula can be written as

\[ \int_c^d f(x)\, dx = \int_a^b f(g(u))\, |g'(u)|\, du, \]

where the absolute value corrects the sign in the case of $g'(u) < 0$ (and hence
$g(b) < g(a)$). The following theorem gives the analog for double integrals.
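The role of the absolute value can be seen on a decreasing substitution. In the following sketch the choices $f(x) = e^x$ and $g(u) = 1 - u^2$ on $[0,1]$ are assumptions made purely for illustration, as is the midpoint-rule helper.

```python
import math

def midpoint(h, lo, hi, n=1000):
    """Composite midpoint rule for the integral of h over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(h(lo + (k + 0.5) * step) for k in range(n))

f = math.exp                    # integrand f(x)
g = lambda u: 1.0 - u * u       # decreasing bijection [0, 1] -> [0, 1]
dg = lambda u: -2.0 * u         # g'(u) <= 0 on [0, 1]

lhs = midpoint(f, 0.0, 1.0)                                  # int_c^d f(x) dx
rhs = midpoint(lambda u: f(g(u)) * abs(dg(u)), 0.0, 1.0)     # int_a^b f(g(u)) |g'(u)| du
signed = midpoint(lambda u: f(g(u)) * dg(u), 0.0, 1.0)       # int_{g(a)}^{g(b)} f(x) dx
print(lhs, rhs, signed)
```

Without the absolute value the result changes sign, because here $g(a) = 1 > g(b) = 0$.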


(5.7) Theorem (Euler 1769b, Opera, vol. XVII, p. 303 for n = 2, Lagrange 1773,
Oeuvres, vol. 3, p. 624 for n = 3, Jacobi 1841, Werke, vol. 3, p. 436 for arbitrary
n). Let $f : A \to \mathbb{R}$ be continuous, $g : U \to \mathbb{R}^2$ ($U \subset \mathbb{R}^2$ open) be continuously
differentiable, and assume that

i) $A = g(B)$; the sets $A, B \subset \mathbb{R}^2$ are compact; $\partial A$, $\partial B$ are null sets;

ii) $g$ is injective on $B \setminus N$, where $N$ is a null set.

Then, we have

(5.25)  \[ \iint_A f(x,y)\, d(x,y) = \iint_B f(g(u,v))\, |\det g'(u,v)|\, d(u,v). \]

FIGURE 5.8a. Area of the parallelogram    FIGURE 5.8b. Polar coordinates


Polar Coordinates. One of the most important applications of Theorem 5.7 is
when

(5.26)  \[ g(r,\varphi) = (x,y), \qquad x = r\cos\varphi, \quad y = r\sin\varphi \]

(polar coordinates, see Sect. I.5) and when $A = \{ (x,y) \mid x^2 + y^2 \le R^2 \}$. With
$B = [0,R] \times [0,2\pi]$, the assumption (i) of Theorem 5.7 is satisfied. The function $g$
of (5.26) is not injective on $B$ (we have $g(r,0) = g(r,2\pi)$ for all $r$, and $g(0,\varphi) =
(0,0)$ for all $\varphi$). However, if we remove from $B$ the null set $N = (\{0\} \times [0,2\pi]) \cup
([0,R] \times \{2\pi\})$ (see Fig. 5.8b), the function $g$ becomes injective on $B \setminus N$. Since

\[ \det g'(r,\varphi) = \det \begin{pmatrix} \cos\varphi & -r\sin\varphi \\ \sin\varphi & r\cos\varphi \end{pmatrix} = r, \]

it follows from Theorem 5.7 and Eq. (5.11) that

(5.27)  \[ \iint_{x^2+y^2 \le R^2} f(x,y)\, d(x,y) = \int_0^{2\pi} \int_0^R f(r\cos\varphi, r\sin\varphi)\, r\, dr\, d\varphi. \]
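Formula (5.27) can be compared numerically with the Cartesian iterated integral (5.22) on the unit disc. The sketch below rests on assumptions of its own: the test function $f(x,y) = (1+x)\,y^2$, the radius $R = 1$, and the midpoint-rule helper are illustrative choices. Since the term $x y^2$ integrates to zero by symmetry, both values should be close to $\pi/4$.

```python
import math

def midpoint(g, lo, hi, n=400):
    """Composite midpoint rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def f(x, y):
    return (1.0 + x) * y * y

R = 1.0

def half_chord(x):
    """Half-length of the vertical chord of the disc at abscissa x."""
    return math.sqrt(max(R * R - x * x, 0.0))

# Cartesian version, as in (5.22): inner integral in y, outer integral in x.
cartesian = midpoint(
    lambda x: midpoint(lambda y: f(x, y), -half_chord(x), half_chord(x)), -R, R)

# Polar version, as in (5.27): note the extra factor r = det g'(r, phi).
polar = midpoint(
    lambda phi: midpoint(
        lambda r: f(r * math.cos(phi), r * math.sin(phi)) * r, 0.0, R),
    0.0, 2.0 * math.pi)

print(cartesian, polar, math.pi / 4)
```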


Proof of Theorem 5.7.
Main Ideas. We cover $B$ by a division of closed squares $J_\beta$ with side length $\delta$
(see Fig. 5.9, left picture), set $\mathcal{B} = \{ \beta \mid J_\beta \cap B \ne \emptyset \}$, and let $(u_\beta, v_\beta)$ be the left
bottom vertex of $J_\beta$. We assume that $\delta$ is sufficiently small, so that all $J_\beta$ ($\beta \in \mathcal{B}$)
still lie in $U$. The image set $g(J_\beta)$ of $J_\beta$ is approximately a parallelogram with
sides (Fig. 5.9, right picture; Fig. 5.10)

(5.28)  \[ a = \frac{\partial g}{\partial u}(u_\beta, v_\beta)\, \delta, \qquad b = \frac{\partial g}{\partial v}(u_\beta, v_\beta)\, \delta \]

(see Example 3.2). Now, from elementary geometry we know that the area of this
parallelogram is equal to the determinant²

(5.29)  \[ \text{area parall.} = |\det(a\ \ b)| = \left| \det \begin{pmatrix} a_1 & b_1 \\ a_2 & b_2 \end{pmatrix} \right| = |a_1 b_2 - a_2 b_1|, \]

and, inspired by Eq. (5.9), we have

\[ \iint_A f(x,y)\, d(x,y) \approx \sum_{\beta \in \mathcal{B}} f(g(u_\beta, v_\beta)) \cdot (\text{area of } g(J_\beta)) \]
\[ \approx \sum_{\beta \in \mathcal{B}} f(g(u_\beta, v_\beta))\, |\det g'(u_\beta, v_\beta)|\, \mu(J_\beta) \]
\[ \approx \iint_B f(g(u,v))\, |\det g'(u,v)|\, d(u,v). \]

This motivates the validity of Eq. (5.25).

FIGURE 5.9. Transformation of the lattice
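The heuristic chain of approximations can be imitated directly: sum $f(g(u_\beta, v_\beta))\, |\det g'(u_\beta, v_\beta)|\, \mu(J_\beta)$ over a lattice of squares, with $(u_\beta, v_\beta)$ the left bottom vertex. The Python sketch below is an illustration under assumptions not taken from the text: the Jacobian determinant is approximated by central differences, and the names `jacobian_det` and `transformed_riemann_sum` are hypothetical.

```python
import math

def jacobian_det(g, u, v, h=1e-6):
    """Central-difference approximation of det g'(u, v) for g: R^2 -> R^2,
    computed as a1*b2 - a2*b1 in the notation of (5.29)."""
    gu_plus, gu_minus = g(u + h, v), g(u - h, v)
    gv_plus, gv_minus = g(u, v + h), g(u, v - h)
    a1 = (gu_plus[0] - gu_minus[0]) / (2 * h)   # dg_1/du
    a2 = (gu_plus[1] - gu_minus[1]) / (2 * h)   # dg_2/du
    b1 = (gv_plus[0] - gv_minus[0]) / (2 * h)   # dg_1/dv
    b2 = (gv_plus[1] - gv_minus[1]) / (2 * h)   # dg_2/dv
    return a1 * b2 - a2 * b1

def transformed_riemann_sum(f, g, u_range, v_range, n):
    """Sum of f(g(u_b, v_b)) |det g'(u_b, v_b)| mu(J_b) over an n x n lattice,
    where (u_b, v_b) is the left bottom vertex of the cell J_b."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            ub, vb = u0 + i * du, v0 + j * dv
            x, y = g(ub, vb)
            total += f(x, y) * abs(jacobian_det(g, ub, vb)) * du * dv
    return total

def polar(r, phi):
    return (r * math.cos(phi), r * math.sin(phi))

approx = transformed_riemann_sum(lambda x, y: 1.0, polar, (0.0, 1.0), (0.0, 2.0 * math.pi), 200)
print(approx, math.pi)   # area of the unit disc, up to the Riemann-sum error
```

Because the left bottom vertex is used rather than a midpoint, the sum converges only at first order in the side length, in line with the rough estimates of the proof.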


Rigorous Estimates. The integrands in Eq. (5.25) are continuous on $A$ and $B$, respectively.
Since $A$ and $B$ are compact, these functions are bounded. Moreover,
$\partial A$ and $\partial B$ are null sets, so that by (5.18) the two integrals in Eq. (5.25) exist. In
the following we extend the domain of $f$ to $\mathbb{R}^2$ by putting $f(x,y) = 0$ outside
of $A$.

In order to grasp the precise meaning of the left integral of (5.25), we introduce,
in addition to the above division of $B$, a division of $A$ into squares $I_\alpha$, set
$\mathcal{A} = \{ \alpha \mid I_\alpha \cap A \ne \emptyset \}$, and choose $(x_\alpha, y_\alpha) \in I_\alpha \cap A$ (these are the fish-eyes
in Fig. 5.10). Equation (5.25) will be proved by showing that the difference of the
Riemann sums of the two integrals (see Theorem 5.2 and Eq. (5.9))

(5.30)  \[ \sum_{\alpha \in \mathcal{A}} f(x_\alpha, y_\alpha)\, \mu(I_\alpha) - \sum_{\beta \in \mathcal{B}} f(g(u_\beta, v_\beta))\, |\det g'(u_\beta, v_\beta)|\, \mu(J_\beta), \]

is smaller than $\varepsilon$ for any given $\varepsilon > 0$. It turns out that the side length of the squares
$I_\alpha$ must be much smaller than $\delta$ (the side length of $J_\beta$). We take it $\le \varepsilon \cdot \delta$.

² The two expressions on the left and on the right of (5.29) are
i) invariant under transformations of the type $b \mapsto b + \lambda a$ (Cavalieri's principle), and
ii) equal for rectangles parallel to the axes (i.e., diagonal matrices); see Fig. 5.8a.
For more details see Strang (1976, p. 164).


FIGURE 5.10. Squares $I_\alpha$ for which $\alpha$ belongs to $P_\beta$


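Partition of $\mathcal{A}$. The left sum of (5.30) contains many more terms than the right
one. In order to compare corresponding terms in this difference, we partition the
set $\mathcal{A}$ as

\[ \mathcal{A} = \bigcup_{\beta \in \mathcal{B}} P_\beta \qquad \text{(disjoint union)}, \]

in such a way that

(5.31a)  \[ (x_\alpha, y_\alpha) \in g(J_\beta) \quad \text{if } \alpha \in P_\beta, \]
(5.31b)  \[ \alpha \in P_\beta \quad \text{if } I_\alpha \subset g(J_\beta) \text{ and } J_\beta \subset B \setminus N \]

(see Fig. 5.10). For a given $\alpha \in \mathcal{A}$ we can, since $(x_\alpha, y_\alpha) \in A = g(B) \subset
\bigcup_{\beta \in \mathcal{B}} g(J_\beta)$, always find a $\beta$ which satisfies (5.31a). In order to be able to satisfy
(5.31b), we have to show that there is at most one $\beta \in \mathcal{B}$ with $J_\beta \subset B \setminus N$ such
that $I_\alpha \subset g(J_\beta)$. Suppose that $I_\alpha \subset g(J_\beta) \cap g(J_{\beta'})$ for some $\beta \ne \beta'$. Since
$g$ is injective on $B \setminus N$, we have $g(J_\beta) \cap g(J_{\beta'}) \subset g(J_\beta \cap J_{\beta'})$, so that also
$I_\alpha \subset g(J_\beta \cap J_{\beta'})$. But $J_\beta \cap J_{\beta'}$ is either empty, or a point, or a segment of a
line, so that $g(J_\beta \cap J_{\beta'})$ is a null set by Lemma 5.5. Hence $I_\alpha \subset g(J_\beta \cap J_{\beta'})$ is
impossible for $\beta \ne \beta'$.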
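Once the sets $P_\beta$ are determined, we use $\sum_{\alpha \in \mathcal{A}} = \sum_{\beta \in \mathcal{B}} \sum_{\alpha \in P_\beta}$, write the
expression of Eq. (5.30) as $\sum_{\beta \in \mathcal{B}} D_\beta$ with

(5.32)  \[ D_\beta = \sum_{\alpha \in P_\beta} f(x_\alpha, y_\alpha)\, \mu(I_\alpha) - f(g(u_\beta, v_\beta))\, |\det g'(u_\beta, v_\beta)|\, \mu(J_\beta), \]

and estimate these terms. For the moment, we consider only so-called "interior"
$J_\beta$'s, i.e., we suppose that $J_\beta \subset B \setminus N$. We write $D_\beta$ as

(5.33a)  \[ D_\beta = \sum_{\alpha \in P_\beta} \bigl( f(x_\alpha, y_\alpha) - f(g(u_\beta, v_\beta)) \bigr)\, \mu(I_\alpha) \]
(5.33b)  \[ \qquad + f(g(u_\beta, v_\beta)) \Bigl( \sum_{\alpha \in P_\beta} \mu(I_\alpha) - |\det g'(u_\beta, v_\beta)|\, \mu(J_\beta) \Bigr) \]

and estimate these two expressions separately.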


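Estimation of (5.33a). Since $g(u,v)$ is continuously differentiable, $g'(u,v)$ is
bounded on the compact set $B$ (Theorem 2.3), i.e.,

(5.34)  \[ \|g'(u,v)\| \le M_1 \qquad \text{for } (u,v) \in B. \]

Hence, the Mean Value Theorem 3.7 implies that

\[ \|(x_\alpha, y_\alpha)^T - g(u_\beta, v_\beta)\| \le M_1 \cdot \delta \cdot \sqrt{2} \qquad \text{for } \alpha \in P_\beta \]

(indeed, $(x_\alpha, y_\alpha)$ lies in $g(J_\beta)$ and the points of $J_\beta$ have from $(u_\beta, v_\beta)$ a distance
of at most $\delta \cdot \sqrt{2}$). It then follows from the uniform continuity of $f$ on $A$ ($f$
is continuous on the compact set $A$) that $|f(x_\alpha, y_\alpha) - f(g(u_\beta, v_\beta))| < \varepsilon$ for
sufficiently small $\delta$ (remember that $g(J_\beta) \subset A$ since $J_\beta$ is interior). Therefore,

(5.35)  \[ \Bigl| \sum_{\alpha \in P_\beta} \bigl( f(x_\alpha, y_\alpha) - f(g(u_\beta, v_\beta)) \bigr)\, \mu(I_\alpha) \Bigr| \le \varepsilon \sum_{\alpha \in P_\beta} \mu(I_\alpha). \]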


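Estimation of (5.33b). We now must concentrate more seriously on the question
of how precisely the set $g(J_\beta)$ is approached by the parallelogram spanned by the
vectors $a$ and $b$ in (5.28). We denote this set by

\[ R_\beta = \Bigl\{ g(u_\beta, v_\beta) + \frac{\partial g}{\partial u}(u_\beta, v_\beta)\, s + \frac{\partial g}{\partial v}(u_\beta, v_\beta)\, t \;\Big|\; s \in [0,\delta],\ t \in [0,\delta] \Bigr\}. \]

We compute the distance of two corresponding points $g(u_\beta + s, v_\beta + t)$ in $g(J_\beta)$
and $g(u_\beta, v_\beta) + \frac{\partial g}{\partial u}(u_\beta, v_\beta)\, s + \frac{\partial g}{\partial v}(u_\beta, v_\beta)\, t$ in $R_\beta$ in the following way: Equation
(III.6.16) written for $F(\tau) = g(u_\beta + \tau s, v_\beta + \tau t)$ means that

\[ g(u_\beta + s, v_\beta + t) - g(u_\beta, v_\beta) = \int_0^1 g'(u_\beta + \tau s, v_\beta + \tau t) \cdot \begin{pmatrix} s \\ t \end{pmatrix} d\tau. \]

Subtracting $\partial g/\partial u(u_\beta, v_\beta) \cdot s + \partial g/\partial v(u_\beta, v_\beta) \cdot t$ from both sides, we obtain

\[ \Bigl\| g(u_\beta + s, v_\beta + t) - \Bigl( g(u_\beta, v_\beta) + \frac{\partial g}{\partial u}(u_\beta, v_\beta)\, s + \frac{\partial g}{\partial v}(u_\beta, v_\beta)\, t \Bigr) \Bigr\| \]
\[ = \Bigl\| \int_0^1 \bigl( g'(u_\beta + \tau s, v_\beta + \tau t) - g'(u_\beta, v_\beta) \bigr) \begin{pmatrix} s \\ t \end{pmatrix} d\tau \Bigr\| \le \varepsilon \cdot \sqrt{2}\, \delta \]

for $0 \le s, t \le \delta$. The last estimate follows from the uniform continuity of $g'$ on
the compact set $B$ (recall that $J_\beta$ is interior).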
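Next we enclose $R_\beta$ between two sets

\[ R_\beta^- \subset R_\beta \subset R_\beta^+ \]

where (see Fig. 5.10)

\[ R_\beta^+ = \{ \text{set of points with distance} \le 2\sqrt{2}\, \varepsilon\delta \text{ from the closest point of } R_\beta \} \]
\[ R_\beta^- = \{ \text{set of points in } R_\beta \text{ with distance} \ge 2\sqrt{2}\, \varepsilon\delta \text{ from the border} \}. \]

Since the distance $2\sqrt{2}\, \varepsilon\delta$ chosen in these definitions is twice $\sqrt{2}\, \varepsilon\delta$, which, on
one side, is the maximal distance between corresponding points of $g(J_\beta)$ and $R_\beta$,
and on the other side the maximal diameter of the squares $I_\alpha$, the sets $R_\beta^-$ and
$R_\beta^+$ also enclose, because of (5.31a) and (5.31b), the union of $I_\alpha$ for $\alpha \in P_\beta$ (see
Fig. 5.10 again)

\[ R_\beta^- \subset \bigcup_{\alpha \in P_\beta} I_\alpha \subset R_\beta^+. \]

Since $R_\beta^+ \setminus R_\beta^-$ is a "ring" of length $\le 4M_1\delta$ (see (5.34)) and of "thickness"
$\le 4\sqrt{2}\, \varepsilon\delta$, the above inclusions lead to the estimate

\[ \Bigl| \sum_{\alpha \in P_\beta} \mu(I_\alpha) - \mu(R_\beta) \Bigr| \le \mu(R_\beta^+ \setminus R_\beta^-) \le (4M_1\delta)(4\sqrt{2}\, \varepsilon\delta). \]

Consequently, we have

(5.36)  \[ \Bigl| f(g(u_\beta, v_\beta)) \Bigl( \sum_{\alpha \in P_\beta} \mu(I_\alpha) - |\det g'(u_\beta, v_\beta)|\, \mu(J_\beta) \Bigr) \Bigr| \le C\varepsilon\delta^2 = C\varepsilon\, \mu(J_\beta) \]

with $C = M \cdot 4M_1 \cdot 4\sqrt{2}$.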

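Finale. If $J_\beta \not\subset B \setminus N$ (so that $J_\beta$ intersects the null set $\partial B \cup N$), we estimate $D_\beta$
of Eq. (5.32) by $|D_\beta| \le M_2\, \mu(J_\beta)$, where $M_2$ is a constant depending on bounds
of $f$ and $g'$. If $\delta$ is sufficiently small, it follows from (5.15) that the sum of these
$|D_\beta|$ is $\le M_2 \varepsilon$. For the remaining $J_\beta$ we use (5.35) and (5.36), together with
(5.33), and obtain

\[ |D_\beta| \le \varepsilon \sum_{\alpha \in P_\beta} \mu(I_\alpha) + C\varepsilon\, \mu(J_\beta). \]

All in all, the difference (5.30) of the Riemann sums, i.e., $\sum_{\beta \in \mathcal{B}} D_\beta$, is arbitrarily
small ($\le \text{Const} \cdot \varepsilon$).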
(5.8) Example. Let $A = \{ (x,y) \mid x^2 + y^2 \le R^2 \}$ be the disc of radius $R$. Its area
can be computed as

\[ \iint_A 1\, d(x,y) = \int_0^{2\pi} \int_0^R r\, dr\, d\varphi = \frac{R^2}{2} \cdot 2\pi = R^2 \pi. \]

The moment of inertia with respect to a rotation around a diameter is

\[ \iint_A y^2\, d(x,y) = \int_0^{2\pi} \int_0^R r^2 \sin^2\varphi \cdot r\, dr\, d\varphi = \frac{R^4 \pi}{4}. \]

The moment of inertia with respect to a central rotation axis orthogonal to the disc
is

\[ \iint_A (x^2 + y^2)\, d(x,y) = \int_0^{2\pi} \int_0^R r^2 \cdot r\, dr\, d\varphi = \frac{R^4 \pi}{2}. \]

FIGURE 5.11. Spherical coordinates


Spherical Coordinates. The extension of the results of this section to higher
dimensions can be carried out without any major difficulties. Let us give an
interesting application of the transformation formula (5.25) in three dimensions.

We consider spherical coordinates $g(r,\varphi,\theta) = (x,y,z)$ defined by (Fig. 5.11)

(5.37)  \[ x = r\cos\varphi\sin\theta, \qquad y = r\sin\varphi\sin\theta, \qquad z = r\cos\theta \]

and are interested in triple integrals over a sphere $A = \{ (x,y,z) \mid x^2 + y^2 + z^2 \le
R^2 \}$. With $B = [0,R] \times [0,2\pi] \times [0,\pi]$ and $N = \partial B$, all the assumptions of
Theorem 5.7 are satisfied. Computing the Jacobian matrix of $g$,

\[ g'(r,\varphi,\theta) = \begin{pmatrix} \cos\varphi\sin\theta & -r\sin\varphi\sin\theta & r\cos\varphi\cos\theta \\ \sin\varphi\sin\theta & r\cos\varphi\sin\theta & r\sin\varphi\cos\theta \\ \cos\theta & 0 & -r\sin\theta \end{pmatrix}, \]

we obtain for its determinant $\det g'(r,\varphi,\theta) = -r^2 \sin\theta$, whence (Lagrange 1773)

(5.38)  \[ \iiint_A f(x,y,z)\, d(x,y,z) = \iiint_B \tilde{f}(r,\varphi,\theta)\, r^2 \sin\theta\, d(r,\varphi,\theta), \]

with $\tilde{f}(r,\varphi,\theta) = f(r\cos\varphi\sin\theta, r\sin\varphi\sin\theta, r\cos\theta)$. Looking at Fig. 5.11, this
formula can also be understood, as Lagrange says, "directement sans aucun calcul"
(directly, without any computation).

The volume of the sphere is obtained by taking $f(x,y,z) = 1$,

\[ \iiint_A 1\, d(x,y,z) = \int_0^{\pi} \int_0^{2\pi} \int_0^R r^2 \sin\theta\, dr\, d\varphi\, d\theta = \frac{4R^3\pi}{3}. \]

The moment of inertia with respect to an axis through the origin is

\[ \iiint_A (x^2 + y^2)\, d(x,y,z) = \int_0^{\pi} \int_0^{2\pi} \int_0^R r^2 \sin^2\theta \cdot r^2 \sin\theta\, dr\, d\varphi\, d\theta = \frac{8R^5\pi}{15}. \]
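Both the determinant $-r^2\sin\theta$ and the two values just obtained lend themselves to a numerical sanity check. The sketch below is illustrative only and makes its own assumptions: the Jacobian is approximated by central differences, the triple integral by a midpoint sum with 40 cells per direction, and the helper names (`spherical`, `det3`, `jacobian_det`, `ball_integral`) are hypothetical.

```python
import math

def spherical(r, phi, theta):
    """The map g of (5.37)."""
    return (r * math.cos(phi) * math.sin(theta),
            r * math.sin(phi) * math.sin(theta),
            r * math.cos(theta))

def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def jacobian_det(r, phi, theta, h=1e-6):
    """Central-difference approximation of det g'(r, phi, theta)."""
    point = (r, phi, theta)
    columns = []
    for k in range(3):
        plus, minus = list(point), list(point)
        plus[k] += h
        minus[k] -= h
        gp, gm = spherical(*plus), spherical(*minus)
        columns.append([(gp[i] - gm[i]) / (2 * h) for i in range(3)])
    rows = [[columns[k][i] for k in range(3)] for i in range(3)]
    return det3(rows)

def ball_integral(f_tilde, R, n=40):
    """Midpoint sum for the right-hand side of (5.38)."""
    hr, hp, ht = R / n, 2.0 * math.pi / n, math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            phi = (j + 0.5) * hp
            for k in range(n):
                theta = (k + 0.5) * ht
                total += f_tilde(r, phi, theta) * r * r * math.sin(theta)
    return total * hr * hp * ht

det_check = jacobian_det(1.3, 0.7, 1.1)                       # compare with -r^2 sin(theta)
volume = ball_integral(lambda r, phi, theta: 1.0, 1.0)        # compare with 4 pi / 3
inertia = ball_integral(lambda r, phi, theta: (r * math.sin(theta)) ** 2, 1.0)  # 8 pi / 15
print(det_check, volume, inertia)
```

Note that $x^2 + y^2 = r^2 \sin^2\theta$ in spherical coordinates, which is the integrand used for the moment of inertia.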


Integrals with Unbounded Domain 


In certain situations, one is confronted with the computation of an integral over an 
unbounded domain. As in Sect. III.8 (improper integrals), this can be managed by 
taking a limit. We shall illustrate this on some interesting examples. 


"Gaussian" Integral. Suppose we want to compute $I = \int_0^\infty e^{-x^2}\, dx$. The idea
is to take the square of $I$ and to transform it into a double integral

(5.39)  \[ I^2 = \lim_{R \to \infty} \Bigl( \int_0^R e^{-x^2}\, dx \Bigr) \Bigl( \int_0^R e^{-y^2}\, dy \Bigr) = \lim_{R \to \infty} \iint_{A_R} e^{-x^2 - y^2}\, d(x,y), \]

where $A_R = [0,R] \times [0,R]$. The integrand of the double integral suggests taking
polar coordinates. Putting $D_R = \{ (x,y) \mid x^2 + y^2 \le R^2,\ x \ge 0,\ y \ge 0 \}$, we
have

(5.40)  \[ \lim_{R \to \infty} \iint_{D_R} e^{-x^2 - y^2}\, d(x,y) = \lim_{R \to \infty} \int_0^{\pi/2} \int_0^R e^{-r^2}\, r\, dr\, d\varphi = \frac{\pi}{4}. \]

Here, the additional "r" originating from Eq. (5.27) was most welcome and
allowed integration of the inner integral with an easy substitution. The question is
whether the two limits in (5.39) and (5.40) are equal. If $f(x,y) \ge 0$ (as is the case
here), we have

\[ \iint_{D_R} f(x,y)\, d(x,y) \le \iint_{A_R} f(x,y)\, d(x,y) \le \iint_{D_{\sqrt{2}R}} f(x,y)\, d(x,y) \]

as a consequence of the inclusion $D_R \subset A_R \subset D_{\sqrt{2}R}$ (see the small drawing).
Thus, the existence of $\lim_{R \to \infty} \iint_{D_R} f(x,y)\, d(x,y)$ implies that of
$\lim_{R \to \infty} \iint_{A_R} f(x,y)\, d(x,y)$, and both limits have the same value. Consequently, $I =
\sqrt{\pi}/2$. There is also an interesting connection with the gamma function,

(5.41)  \[ \sqrt{\pi} = 2 \int_0^\infty e^{-x^2}\, dx = \int_0^\infty e^{-t}\, t^{1/2}\, \frac{dt}{t} = \Gamma(1/2) \]

(see Definition III.8.10).
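The squeezing argument $D_R \subset A_R \subset D_{\sqrt{2}R}$ can be observed numerically. In the following sketch, which is an illustration and not part of the original argument, the integral over $A_R$ is computed as the square of a one-dimensional midpoint sum (as in (5.39)), while the integral over $D_R$ uses the closed form $\frac{\pi}{4}(1 - e^{-R^2})$ that results from the inner integration in (5.40). The function names are hypothetical.

```python
import math

def midpoint(g, lo, hi, n=2000):
    """Composite midpoint rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def square_integral(R):
    """Integral of exp(-x^2 - y^2) over A_R = [0, R] x [0, R], as a product."""
    one_dim = midpoint(lambda x: math.exp(-x * x), 0.0, R)
    return one_dim * one_dim

def quarter_disc_integral(R):
    """Integral over D_R in polar coordinates: the inner integral of
    exp(-r^2) r over [0, R] equals (1 - exp(-R^2)) / 2, the angle gives pi/2."""
    return (math.pi / 4.0) * (1.0 - math.exp(-R * R))

for R in (1.0, 2.0, 3.0):
    print(R, quarter_disc_integral(R), square_integral(R),
          quarter_disc_integral(math.sqrt(2.0) * R))

I_approx = math.sqrt(square_integral(6.0))
print(2.0 * I_approx, math.sqrt(math.pi))
```

Each printed row should be increasing from left to right, and all three columns approach $\pi/4$ as $R$ grows.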


FIGURE 5.12. Study of the transformation (5.43)


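A Product Formula for the Gamma Function. From Definition III.8.10, we
have

\[ \Gamma(\alpha) = \int_0^\infty e^{-x} x^{\alpha - 1}\, dx, \qquad \Gamma(\beta) = \int_0^\infty e^{-y} y^{\beta - 1}\, dy, \]

so that (see Jacobi 1834, Werke, vol. VI, p. 62)

(5.42)  \[ \Gamma(\alpha)\Gamma(\beta) = \lim_{R \to \infty} \iint_{A_R} e^{-x-y}\, x^{\alpha - 1} y^{\beta - 1}\, d(x,y), \]

where, as above, $A_R = [0,R] \times [0,R]$. This time, we use the transformation
(Fig. 5.12)

(5.43)  \[ \begin{aligned} x + y &= u \\ y &= v \end{aligned} \qquad \text{i.e.,} \qquad \begin{pmatrix} x \\ y \end{pmatrix} = g(u,v) = \begin{pmatrix} u - v \\ v \end{pmatrix}, \]

whose Jacobian matrix satisfies $\det g'(u,v) = \det \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} = 1$. With $B_R =
\{ (x,y) \mid x \ge 0,\ y \ge 0,\ x + y \le R \}$, we find that

\[ \lim_{R \to \infty} \iint_{B_R} e^{-x-y}\, x^{\alpha - 1} y^{\beta - 1}\, d(x,y) = \lim_{R \to \infty} \int_0^R \Bigl( \int_0^u e^{-u} (u - v)^{\alpha - 1} v^{\beta - 1}\, dv \Bigr) du \]

(5.44)  \[ = \lim_{R \to \infty} \int_0^R e^{-u} u^{\alpha + \beta - 1}\, du \cdot \int_0^1 (1 - t)^{\alpha - 1} t^{\beta - 1}\, dt, \]

where we have used the substitution $v = u \cdot t$ ($0 \le t \le 1$). The same argument
as for the Gaussian integral guarantees that the two limits of (5.42) and (5.44) are
equal. In (5.44), the so-called beta function appears,

(5.45)  \[ B(\alpha, \beta) = \int_0^1 (1 - t)^{\alpha - 1} t^{\beta - 1}\, dt, \]

and we have the formula

(5.46)  \[ B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)} \]

which generalizes Eq. (II.4.34) to arbitrary exponents.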
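Formula (5.46) is easy to test numerically. The sketch below compares a midpoint approximation of the integral (5.45) with the quotient of gamma values computed by Python's `math.gamma`. It is meant as an illustration only: the helper names are hypothetical, and the midpoint rule is assumed adequate because the exponents are chosen $\ge 1$, so that the integrand has no endpoint singularity.

```python
import math

def beta_integral(alpha, beta, n=20000):
    """Midpoint rule for (5.45); intended for alpha, beta >= 1."""
    h = 1.0 / n
    return h * sum((1.0 - t) ** (alpha - 1.0) * t ** (beta - 1.0)
                   for t in ((k + 0.5) * h for k in range(n)))

def beta_from_gamma(alpha, beta):
    """Right-hand side of (5.46)."""
    return math.gamma(alpha) * math.gamma(beta) / math.gamma(alpha + beta)

for alpha, beta in ((2.0, 3.0), (2.5, 3.5)):
    print(alpha, beta, beta_integral(alpha, beta), beta_from_gamma(alpha, beta))
```

For integer arguments the formula reduces to factorials, e.g. $B(2,3) = 1!\,2!/4! = 1/12$.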


Counterexample. The function $f(x,y) = (x - y)/(x + y)^3$ is continuous on
$A = [1,\infty) \times [1,\infty)$. Nevertheless, we have (see also Exercise 5.3)

(5.47)  \[ \underbrace{\int_1^\infty \int_1^\infty \frac{x - y}{(x + y)^3}\, dx\, dy}_{+1/2} \;\ne\; \underbrace{\int_1^\infty \int_1^\infty \frac{x - y}{(x + y)^3}\, dy\, dx}_{-1/2}, \]

which violates Eqs. (5.11) and (5.12). This phenomenon is only possible for an
unbounded domain $A$ and a function $f$ that changes sign on $A$.
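The two values $\pm 1/2$ can be traced as follows. A direct computation shows that $-x/(x+y)^2$ is an antiderivative of $f$ with respect to $x$, and $y/(x+y)^2$ one with respect to $y$, so the inner integrals are $1/(1+y)^2$ and $-1/(1+x)^2$. The Python sketch below uses these closed forms for the inner integrals and a numerical outer integration. The substitution $t = 1/s$ that maps $[1,\infty)$ to $(0,1]$ and the function names are assumptions made for this illustration.

```python
def f(x, y):
    return (x - y) / (x + y) ** 3

def inner_dx(y):
    """Integral of f over x in [1, oo), from the antiderivative -x/(x+y)^2."""
    return 1.0 / (1.0 + y) ** 2

def inner_dy(x):
    """Integral of f over y in [1, oo), from the antiderivative y/(x+y)^2."""
    return -1.0 / (1.0 + x) ** 2

def improper_from_one(h, n=2000):
    """Integral of h over [1, oo) via t = 1/s and the midpoint rule in s on (0, 1)."""
    step = 1.0 / n
    return step * sum(h(1.0 / s) / (s * s)
                      for s in ((k + 0.5) * step for k in range(n)))

dx_then_dy = improper_from_one(inner_dx)   # inner integral in x, outer in y
dy_then_dx = improper_from_one(inner_dy)   # inner integral in y, outer in x
print(dx_then_dy, dy_then_dx)
```

The antisymmetry $f(x,y) = -f(y,x)$ explains why the two iterated integrals differ exactly by their sign.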


Exercises 


5.1 Let $g : [a,b] \to \mathbb{R}$ be a bounded function and assume that all its Riemann
sums converge to a fixed value $\alpha$ if $\max_i (x_i - x_{i-1}) \to 0$. Prove that $g(x)$
is integrable (in the sense of Riemann) and that $\int_a^b g(x)\, dx = \alpha$.
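5.2 For $I := [0,\pi] \times [0,1]$ define $f : I \to \mathbb{R}$ by

\[ f(x,y) = \begin{cases} \cos x & \text{if } y \in \mathbb{Q} \\ 0 & \text{if not.} \end{cases} \]

Which of the two integrals

\[ \int_0^1 \Bigl( \int_0^\pi f(x,y)\, dx \Bigr) dy \qquad \text{and} \qquad \int_0^\pi \Bigl( \int_0^1 f(x,y)\, dy \Bigr) dx \]

exists? Is the function $f : I \to \mathbb{R}$ integrable?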
5.3 Show that


(Fig. 5.13). Is this relation a contradiction to Eqs. (5.11) and (5.12)?

Hint. Use $\dfrac{\partial}{\partial x} \Bigl( \dfrac{-x}{(x+y)^2} \Bigr) = \dfrac{x - y}{(x+y)^3}$.

FIGURE 5.13. Function $(x-y)/(x+y)^3$ with noncommuting iterated integrals (stereogram)

5.4 Try to compute

\[ I = \int_0^\pi \Bigl( \int_0^R \frac{-2\cos\varphi + 2r}{1 - 2r\cos\varphi + r^2}\, dr \Bigr) d\varphi, \qquad 0 < R < 1. \]

There is a better way of computing this integral, where the formula (see the
Example for (II.5.21))

\[ \int_0^\pi \frac{d\varphi}{a + b\cos\varphi} = \frac{\pi}{\sqrt{a^2 - b^2}} \]

is helpful (the result is $I = 0$; see also Exercise III.5.4).


5.5 Prove that von Koch's curve of Fig. 5.6, though of infinite length, represents
a null set.

Hint. Let the distance of the two end points be 1. Considering the uppermost
curve of Fig. 5.6, we see that it is contained in a rectangle of sides 1 and 1/3.
The next curve is contained in the union of four rectangles of sides 1/3 and
1/9, and so on.

5.6 Show that "Sierpiński's triangle" (Fig. 1.9) is a null set in $\mathbb{R}^2$.

5.7 The set $\gamma([0,1])$, with


is a null set despite the fact that the function $\gamma$ does not satisfy

\[ \|\gamma(t) - \gamma(s)\| \le M |t - s| \qquad \text{for all } t, s \in [0,1]. \]
5.8 Compute the area of the surface enclosed by the loop of the folium cartesii
(4.29). Try two methods: a) polar coordinates; b) the change of coordinates

\[ u = x + y, \qquad v = x - y. \]

5.9 Compute the area of the surface enclosed by the loops of the lemniscate

\[ (x^2 + y^2)^2 - 2(x^2 - y^2) = 0. \]

Try two methods: a) polar coordinates; b) iterated integrals.

5.10 Let

\[ B_n(r) = \{ (x_1, \ldots, x_n) \in \mathbb{R}^n \;;\; x_1^2 + \ldots + x_n^2 \le r^2 \} \]

be the ball of radius $r$ in $\mathbb{R}^n$. Show that its volume is

\[ \int_{B_n(r)} 1\, d(x_1, \ldots, x_n) = \frac{\pi^{n/2}\, r^n}{\Gamma(n/2 + 1)}. \]


Indication. Proceed by induction on n > 1. A formula derived above for the 
beta function will be helpful. 


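5.11 Compute the volume of the simplex

\[ A_n(c) = \{ (x_1, \ldots, x_n) \in \mathbb{R}^n \;;\; x_i \ge 0 \text{ and } x_1 + x_2 + \ldots + x_n \le c \}. \]

The result is $c^n / n!$.

5.12 Compute

\[ \iiint_T xyz\,(1 - x - y - z)\, dx\, dy\, dz, \]

where $T$ is the tetrahedron defined by

\[ T = \{ (x,y,z) \;;\; x \ge 0,\ y \ge 0,\ z \ge 0,\ x + y + z \le 1 \}. \]

Use the substitution

\[ x + y + z = u, \qquad y + z = uv, \qquad z = uvw. \]

The result is $1/7!$.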


5.13 Let $A_R = [0,R] \times [0,R]$, $D_R = \{ (x,y) \mid x^2 + y^2 \le R^2 \}$, and consider the
limits

\[ \lim_{R \to \infty} \iint_{A_R} \sin(x^2 + y^2)\, d(x,y), \qquad \lim_{R \to \infty} \iint_{D_R} \sin(x^2 + y^2)\, d(x,y). \]

Prove that the first limit exists, whereas the second does not.

Hint. For the first integral use $\sin(x^2 + y^2) = \sin x^2 \cos y^2 + \cos x^2 \sin y^2$
and prove that $\int_0^R \sin x^2\, dx$ converges to a limit for $R \to \infty$. For the second
integral use polar coordinates.
5.14 Prove that

(5.48)  \[ \int_0^\infty \frac{\cos x}{\sqrt{x}}\, dx = \int_0^\infty \frac{\sin x}{\sqrt{x}}\, dx = \sqrt{\frac{\pi}{2}}. \]

Then, deduce from these relations the statement of Eq. (II.6.9).
Hint. Substituting $x = u\sqrt{z}$ ($z$ is a positive parameter) in Eq. (5.41) yields

(5.49)  \[ \frac{1}{\sqrt{z}} = \frac{2}{\sqrt{\pi}} \int_0^\infty e^{-u^2 z}\, du. \]

Multiply this equation by $e^{iz}$, integrate from $A > 0$ to $B$, change the order
of integration in the iterated integrals, and consider the limits $B \to \infty$ and
$A \to 0$. Justify all steps.

Remark. With deeper results of complex analysis, this becomes an easy ex- 
ercise. 

... da der Lehrer einsichtig genug war den ungewohnlichen Schiiler (Jacobi) 
gewiahren zu lassen und es zu gestatten, da dieser sich mit Eulers Introductio 
beschaftigte, wahrend die tibrigen Schiiler miihsam ... . 

(Dirichlet 1852, Gediachtnisrede auf Jacobi, in Jacobi’s Werke, vol. 1, p. 4) 


Tant que I’ Algébre et la Géométrie ont été séparées, leurs progrés ont été lents 
et leurs usages bornés; mais lorsque ces deux sciences se sont réunies, elles se 
sont prété des forces mutuelles et ont marché ensemble d’un pas rapide vers 
la perfection. C’est 4 Descartes qu’on doit l’ application de Il’ Algébre a la Géo- 
métrie, application qui est devenue la clef des plus grandes découvertes dans 
toutes les branches des Mathématiques. 

(Lagrange 1795, Oeuvres, vol.7, p.271) 


Diophante peut étre regardé comme I’inventeur de |’ Algébre; . . . 
(Lagrange 1795, Oeuvres, vol.7, p. 219) 


Tartalea exposa sa solution en mauvais vers italiens ... 
(Lagrange 1795, Oeuvres, vol. 7, p. 22) 


... trovato la sua regola generale, ma per al presente la voglio tacere per piu 
rispetti. (Tartaglia 1530, see M. Cantor 1891, vol. II, p. 485) 


Le Logistique Numerique est celuy qui est exhibé & traité par les nombres, 
le Specifique par especes ou formes des choses: comme par les lettres de 
1’ Alphabet. (Viéte 1600, Algebra nova, French ed. 1630) 


Ou ie vous prie de remarquer en passant, que le scrupule, que faisoient les 
anciens d’vser des termes de |’ Arithmetique en la Geometrie, qui ne pouuoit 
proceder, que de ce qu’ils ne voyoient pas assés clairement leur rapport, causoit 
beaucoup d’obscurité, & d’embaras, en la fagon dont ils s’expliquoient. 
(Descartes 1637) 


Quoy que cette proposition ait vne infinité de cas, i’en donneray vne demon- 
stration bien courte, en supposant 2 lemmes. 
Le 1. qui est evident de soy-mesme, que cette proportion se rencontre dans la 
seconde base; car il est bien visible que y est 4o comme 1, a 1. 
Le 2. que si cette proportion se trouue dans vne base quelconque, elle se trou- 
uera necessairement dans la base suivante. 

(Pascal 1654, one of the first induction proofs) 


Der Begriff des Logarithmus wird von den Schiilern im allgemeinen nur sehr 
schwer verstanden. (van der Waerden 1957, p. 1) 


Mense Septembri 1668, Mercator Logarithmotechniam edidit suam, quae spec- 
imen hujus Methodi (i.e., Serierum Infinitarum) in unica tantum Figura, nempe, 
Quadratura Hyperbole continet. (Letter of Collins, Julii 26, 1672) 


Die Gleichungen ... haben . .. ein ehrwiirdiges Alter. Schon Ptolemaus leitet 
a (L. Vietoris 1949, J. reine ang. Math., vol. 186, p. 1) 


. vous ne laisserez pas d’avoir trouvé une proprieté du cercle tres remar- 
quable, ce qui sera celebre a jamais parmi les geometres. 
(Letter of Huygens to Leibniz, Nov. 7, 1674) 


Au reste tant les vrayes racines que les fausses ne sont pas tousiours reelles; 
mais quelquefois seulement imaginaires; c’ est a dire qu’on peut bien tousiours 
en imaginer autant que iay dit en chasque Equation; mais qu’il n’y a quelque- 
fois aucune quantité, qui corresponde a celles qu’on imagine. 

(Descartes 1637, p. 380) 


... quomodo quantitates exponentiales imaginariae ad sinus et cosinus arcuum 
realium reducantur. (Euler 1748, Introductio, §138) 


...etie voy déja la route de trouver la somme de cette rangée t + i + ; + Fete. 
(Joh. Bernoulli, May 22, 1691, letter to his brother) 

La Théorie des fractions continues est une des plus utiles de 1’ Arithméti- 
que ... comme elle manque dans les principaux Ouvrages d’ Artihmétique et 
d’Algébre, elle doit étre peu connue des géométres ... je serai satisfait si je 
puis contribuer 4 la leur rendre un peu plus familiére. 

(Lagrange 1793, Oeuvres, vol. 7, p. 6-7) 


Die Veranlassung aber, diese Formeln zu suchen, gab mir des Herrn Eu- 
lers Analysis infinitorum, wo der Ausdruck ... in Form eines Beyspieles 
vork6mmt. (Lambert 1770a) 


Ich kann mit einigem Grunde zweifeln, ob gegenwartige Abhandlung von den- 
jenigen werde gelesen, oder auch verstanden werden die den meisten Antheil 
davon nehmen sollten, ich meyne von denen, die Zeit und Miihe aufwenden, 
die Quadratur des Circuls zu suchen. Es wird sicher genug immer solche geben 
... die von der Geometrie wenig verstehen ... (Lambert 1770a) 


L’étendué de ce calcul est immense: il convient aux Courbes mécaniques, 
comme aux géometriques; les signes radicaux luy sont indifferens, & méme 
souvent commodes; il s’étend 4 tant d’indéterminées qu’on voudra; la com- 
paraison des infiniment petits de tous les genres luy est également facile. Et 
de 14 naissent une infinité de découvertes surprenantes par rapport aux Tan- 
gentes tant courbes que droites, aux questions De maximis & minimis, aux 
points d’infléxion & de rebroussement des courbes, aux Dévelopées, aux Caus- 
tiques par réfléxion ou par réfraction, &c. comme on le verra dans cet ouvrage. 

(Marquis de L’Hospital 1696, Analyse des infiniment petits) 


Et j’ose dire que c’est cecy le problésme le plus utile, & le plus general non 
seulement que ie scache, mais mesme que i’aye iamais desiré de scauoir en 


Geometrie .. . (Descartes 1637, p. 342) 
Quel mépris pour les non-Anglois! Nous les avons trouvé ces methodes, sans 
aucun secours des Anglois. (Joh. Bernoulli 1735, Opera, vol. IV, p. 170) 


Ce que tu me rapportes 4 propos de Bernard Niewentijt n’est que quincaillerie. 
Qui pourrait s’empécher de rire devant les ratiocinations si ridicules qu’il batit 
sur notre calcul, comme s’il était aveugle 4 ses avantages. 

(Letter of Joh. Bernoulli, quoted from Parmentier 1989, p. 316). 


Nous appellerons la fonction fx, fonction primitive, par rapport aux fonctions 
f'a, fx, &c. qui en dérivent, et nous appellerons celles-ci, fonctions dérivées, 
par rapport a celle-la. (Lagrange 1797) 


Je desire seulement qu’il sache que nos questions de maximis et minimis et 
de tangentibus linearum curvarum sont parfaites depuis huit ou dix ans et 
que plusieurs personnes qui les ont vues depuis cinq ou six ans le peuvent 
témoigner. 

(Letter from Fermat to Descartes, June 1638, Oeuvres, tome 2, p. 154-162) 


Mon Frére, Professeur 4 Bale, a pris de 14 occasion de rechercher plusieurs 
courbes que la Nature nous met tous les jours devant les yeux .. . 
(Joh. Bernoulli 1692) 


Je suis tres persuadé qu’il n’y a gueres de geometre au monde qui vous puisse 


étre comparé. (de L’Hospital 1695, letter to Joh. Bernoulli) 
La quantité cy dessus 
ppads 
qqss — ppaa 


se reduit immediatement, sans autre changement, 4 deux fractions logarithmi- 
cales, en la partageant ainsi 
ppads > ipds pds 
qqss—ppaa qs—pa  qs+pa 
(Annex to a letter of Joh. Bernoulli, 1699, see Briefwechsel, vol. 1, p.212) 


Problema 3: Si X denotet functionum quamcunque rationalem fractam ipsius
x, methodum describere, cuius ope formulae X dz integrale investigari conve- 
niat. (Euler 1768, Opera Omnia, vol. XI, p. 28) 


... weil die Analysten nach allen Versuchen endlich geschlossen haben, daB 
man die Hoffnung aufgeben miisse, elliptische Bogen durch algebraische For- 
meln, Logarithmen und Circulbogen auszudriicken. 

(J.H. Lambert 1772, Opera, vol. I, p. 312) 


Bien que le probléme (des quadratures) ait une durée de deux cents ans 
a peu prés, bien qu’il était l’objet de nombreuses recherches de plusieurs 
géométres : Newton, Cotes, Gauss, Jacobi, Hermite, Tchébychef, Christoffel, 
Heine, Radeau [sic], A. Markov, T. Stitjes [sic], C. Possé, C. Andréev, N. Sonin 
et d’autres, il ne peut étre considéré, cependant, comme suffisamment épuisé. 
(Steklov 1918) 


On s’assurera aisément par notre méthode que |’intégrale 7 ede dont les 


Géométres se sont beaucoup occupés, est impossible sous forme finie . . . 
(Liouville 1835, p. 113) 


Claudius Perraltus Medicus Parisinus insignis, tum & Mechanicis atque Ar- 
chitectonicis studiis egregius, & Vitruvii editione notus, idemque in Regia sci- 
entiarum Societate Gallica, dum viveret, non postremus, mihi & aliis ante me 
multis proposuit hoc problema, cujus nondum sibi occurrisse solutionem in- 
genue fatebatur ... (Leibniz 1693) 


Mais pour juger mieux de l’excellence de vostre Algorithme j’attens avec im- 
patience de voir les choses que vous aurez trouvées touchant la ligne de la 
corde ou chaine pendante, que Mr. Bernouilly vous a proposé 4 trouver, dont 
je luy scay bon gré, parce que cette ligne renferme des proprietez singulieres 
et remarquables. Je l’avois considerée autre fois dans ma jeunesse, n’ ayant que 
15 ans, et j’avois demontré au P. Mersenne, que ce n’estoit pas une Parabole 

(Letter of Huygens to Leibniz, Oct. 9, 1690) 


Les efforts de mon frere furent sans succés, pour moi, je fus plus heureux, car 
je trouvai l’adresse ... Il est vrai que cela me couta des meditations qui me 
deroberent le repos d’une nuit entiere ... 

(Joh. Bernoulli, see Briefwechsel, vol. 1, p. 98) 


Datis in plano verticali duobus punctis A et B assignare mobili M, viam AM B 
per quam gravitate sua descendens et moveri incipiens a puncto A, brevissimo 
tempore perveniat ad alterum punctum B. (Joh. Bernoulli 1696) 


Ce probléme me paroist des plus curieux et des plus jolis que l’on ait encore 
proposé, et je serois bien aise de m’y appliquer, mais pour cela il seroit neces- 
saire que vous me |’envoyassiez réduit 4 la mathematique pure, car le phisique 
m’embarasse ... (de L’ Hospital, letter to Joh. Bernoulli, June 15, 1696) 


En vérité rien n’est plus ingenieux que la solution que vous donnez de |’ égalité 

de Mr. votre frere; & cette solution est si simple qu’on est surpris que ce 

probléme ait paru si difficile: c’est 14 ce qu’on appelle une élégante solution. 
(P. Varignon, letter to Joh. Bernoulli “6 Aoust 1697”) 


Per liberare la premessa formula dalle seconde differenze, ... , chiamo p la 
sunnormale BF. (Riccati 1712) 


. eS ist ganz unméglich, heute noch eine Zeile von d’Alembert hinun- 
terzuwtirgen, wahrend man die meisten Eulerschen Sachen noch mit Entzticken 
liest. (Jacobi, see Spiess 1929, p. 139) 


Ich habe immer wieder beobachtet, da8 Mathematiker und Physiker mit ab- 
geschlossenem Examen tiber theoretische Ergebnisse sehr gut, aber tiber die 
einfachsten Naherungsverfahren nicht Bescheid wuBten. 

(L. Collatz 1951, Num. Beh. Diffgl., Springer-Verlag) 


PROBLEMA 85: Proposita aequatione differentiali quacunque eius integrale 
completum vero proxime assignare. (Euler 1768, §650) 
PROBLEMA 86: Methodum praecedentem aequationes differentiales proxime
integrandi magis perficere, ut minus a veritate aberret. (Euler 1768, 8656) 


Der K6nig nennt mich ‘meinen Professor’, und ich bin der gliicklichste Mensch 
auf der Welt! (Euler is proud to serve Frederick II in Berlin) 


J’ai ici un gros cyclope de géométre ... il ne reste plus qu’un oeil a notre 
homme, et une courbe nouvelle, qu’il calcule 4 présent, pourrait le rendre aveu- 
gle tout a fait. (Frederick II, see Spiess 1929, p. 165-166.) 


...et je ne réponds pas que je fasse encore de la géométrie dans dix ans d’ici. I 
me semble aussi que la mine est presque déja trop profonde et ... il faudra tot 
ou tard l’abandonner. La physique et la chimie offrent maintenant des richesses 
plus brillantes et d’une exploitation plus facile ... 

(Lagrange, Sept. 21, 1781, Letter to d’Alembert, Oeuvres, vol. 13, p. 368) 


On dit qu’une grandeur est la limite d’une autre grandeur, quand la seconde 
peut approcher de la premiére plus prés que d’une grandeur donnée, si petite 
qu’on la puisse supposer, .. . 

(D’ Alembert 1765, Encyclopédie, tome neuvieme, a Neufchastel) 


Lorsqu’une quantité variable converge vers une limite fixe, il est souvent utile 
d’indiquer cette limite par une notation particuliére, c’est ce que nous ferons, 
en placant I’ abréviation 
lim 
devant la quantité variable dont il s’agit .. . 
(Cauchy 1821, Cours d’Analyse) 


... Je mehr ich ueber die Principien der Functionentheorie nachdenke — und 

ich thue dies unablissig —, um so fester wird meine Ueberzeugung, dass diese 

auf dem Fundamente algebraischer Wahrheiten aufgebaut werden muss ... 
(Weierstrass 1875, Werke, vol. 2, p. 235) 


Bitte vergi® alles, was Du auf der Schule gelernt hast; denn Du hast es nicht gel- 
ernt. ... indem meine Tochter bekanntlich schon mehrere Semester studieren 
(Chemie), schon auf der Schule Differential- und Integralrechnung gelernt zu 
haben glauben und heute noch nicht wissen, warum x - y = y- x ist. 

(Landau 1930) 


√3 ist also nur ein Zeichen für eine Zahl, welche erst noch gefunden werden
soll, nicht aber deren Definition. Letztere wird jedoch in meiner Weise, etwa
durch

(1.7, 1.73, 1.732, ...)

befriedigend gegeben. (G. Cantor 1889)


... Definition der irrationalen Zahlen, bei welcher Vorstellungen der Geometrie
... oft verwirrend eingewirkt haben. ... Ich stelle mich bei der Definition
auf den rein formalen Standpunkt, indem ich gewisse greifbare Zeichen Zahlen
nenne, so dass die Existenz dieser Zahlen also nicht in Frage steht.
(Heine 1872)


Für mich war damals das Gefühl der Unbefriedigung ein so überwältigendes,
dass ich den festen Entschluss fasste, so lange nachzudenken, bis ich eine
rein arithmetische und völlig strenge Begründung der Principien der
Infinitesimalanalysis gefunden haben würde. ... Dies gelang mir am 24. November
1858, ... aber zu einer eigentlichen Publication konnte ich mich nicht recht
entschliessen, weil erstens die Darstellung nicht ganz leicht, und weil
ausserdem die Sache so wenig fruchtbar ist. (Dedekind 1872)


Die Analysis zu einem blossen Zeichenspiele herabwürdigend ...
(Du Bois-Reymond 1882, Allgemeine Funktionentheorie, Tübingen)


... jusqu’à présent on a regardé ces propositions comme des axiomes.
(Méray 1869, see Dugac 1978, p. 82) 


Une chose étonnante, je trouve, c’est que Monsieur Weierstrass et Monsieur
Kronecker peuvent trouver tant d’auditeurs — entre 15 et 20 — pour des cours
si difficiles et si élevés.
(letter of Mittag-Leffler 1875, see Dugac 1978, p. 68)


Je consacrerai toutes mes forces à répandre de la lumière sur l’immense
obscurité qui règne aujourd’hui dans l’Analyse. Elle est tellement dépourvue de tout
plan et de tout système, qu’on s’étonne seulement qu’il y ait tant de gens qui
s’y livrent — et ce qui pis est, elle manque absolument de rigueur.

(Abel 1826, Oeuvres, vol. 2, p. 263) 


Cauchy est fou, et avec lui il n’y a pas moyen de s’entendre, bien que pour le
moment il soit celui qui sait comment les mathématiques doivent être traitées.
Ce qu’il fait est excellent, mais très brouillé ...

(Abel 1826, Oeuvres, vol. 2, p. 259) 


On appelle ici Fonction d’une grandeur variable, une quantité composée de
quelque manière que ce soit de cette grandeur variable & de constantes.
(Joh. Bernoulli 1718, Opera, vol. 2, p. 241) 


Quocirca, si f(x/a + c) denotet functionem quamcunque ...
(Euler 1734, Opera, vol. XXII, p. 59) 


Entspricht nun jedem x ein einziges, endliches y, ... so heisst y eine ...
Function von x für dieses Intervall. ... Diese Definition schreibt den einzelnen
Theilen der Curve kein gemeinsames Gesetz vor; man kann sich dieselbe aus
den verschiedenartigsten Theilen zusammengesetzt oder ganz gesetzlos
gezeichnet denken. (Dirichlet 1837)


... f(x) sera fonction continue, si ... la valeur numérique de la différence

f(x + α) − f(x)

décroît indéfiniment avec celle de α ...
(Cauchy 1821, Cours d’Analyse, p. 43)


Wir nennen dabei eine Grösse y eine stetige Function von x, wenn man nach
Annahme einer Grösse ε die Existenz von δ beweisen kann, sodass zu jedem
Wert zwischen x₀ − δ ... x₀ + δ der zugehörige Wert von y zwischen y₀ − ε
... y₀ + ε liegt. (Weierstrass 1874)


Ce théorème est connu depuis longtemps ...
(Lagrange 1807, Oeuvres, vol. 8, p. 19, see also p. 133) 


In seinem Satze, dem zufolge eine stetige Funktion einer reellen Veränderlichen
ihre obere und untere Grenze stets wirklich erreicht, d.h. ein Maximum und
Minimum notwendig besitzt, schuf WEIERSTRASS ein Hilfsmittel, das heute
kein Mathematiker bei feineren analytischen oder arithmetischen
Untersuchungen entbehren kann. (Hilbert 1897, Gesammelte Abh., vol. 3, p. 333)


Der Begriff des Grenzwertes einer Funktion ist wohl zuerst von Weierstrass mit
genügender Schärfe definiert worden.
(Pringsheim 1899, Enzyclopädie der Math. Wiss., Band II.1, p. 13)


Dans l’ouvrage de M. Cauchy on trouve le théorème suivant: “Lorsque les
différens termes de la série u₀ + u₁ + u₂ + ... sont des fonctions ...
continues, ... la somme s de la série est aussi ... fonction continue de x.” Mais il me
semble que ce théorème admet des exceptions. Par exemple la série

sin x − ½ sin 2x + ⅓ sin 3x ...

est discontinue pour toute valeur (2m + 1)π de x, ...
(Abel 1826, Oeuvres, vol. 1, p. 224-225)


Es scheint aber noch nicht bemerkt zu sein, dass ... diese Continuität in jedem
einzelnen Punkte ... nicht diejenige Continuität ist ... die man gleichmässige
Continuität nennen kann, weil sie sich gleichmässig über alle Punkte und alle
Richtungen erstreckt. (Heine 1870, p. 361)


Den allgemeinen Gang des Beweises einiger Sätze im §. 3 nach den Principien
des Herrn Weierstrass kenne ich durch mündliche Mittheilungen von ihm
selbst, von Herrn Schwarz und Cantor, so dass ... (Heine 1872, p. 182)


Also zuerst: Was hat man unter ∫_a^b f(x) dx zu verstehen?
(Riemann 1854, Werke, p. 239) 


L’illustre géomètre [Riemann] ... généralise, par une de ces vues qui
n’appartiennent qu’aux esprits de premier ordre, la notion de l’intégrale définie, ...
(Darboux 1875) 


Ich fühle indessen, dass die Art, wie das Criterium der Integrirbarkeit formulirt
wurde, etwas zu wünschen übrig lässt. (Du Bois-Reymond 1875, p. 259)


Bis in die neueste Zeit glaubte man, es sei das Integral einer convergenten Reihe 
... gleich der Summe aus den Integralen der einzelnen Glieder, und erst Herr 
Weierstrass hat bemerkt ... 

(Heine 1870, Ueber trig. Reihen, J.f. Math., vol. 70, p. 353) 


Da diese Functionen noch nirgends betrachtet sind, wird es gut sein, von einem 
bestimmten Beispiele auszugehen. (Riemann 1854, Werke, p. 228) 


... la rigueur, dont je m’étais fait une loi dans mon Cours d’analyse, ...
(Cauchy 1829, Leçons)


Die vollständige Veränderung f(x + h) − f(x) ... lässt sich im allgemeinen in
zwei Teile zerlegen ... (Weierstrass 1861)


Voir la belle démonstration de ce théorème, donnée par M. O. Bonnet, dans le
Traité de Calcul différentiel et intégral de M. Serret, t. I, p. 17. 
(Darboux 1875, p. 111) 


... tout à fait au-dessus de la vaine gloire, que la plupart des Sçavans
recherchent avec tant d’avidité ...

(Fontenelle’s opinion concerning Guillaume-François-Antoine de Lhospital,
Marquis de Sainte-Mesme et du Montellier, Comte d’Antremonts, Seigneur
d’Ouques, 1661–1704)


Au reste je reconnois devoir beaucoup aux lumieres de Mrs Bernoulli, sur tout
à celles du jeune presentement Professeur à Groningue. Je me suis servi sans
façon de leurs découvertes ... (de L’Hospital 1696)


Où est-il démontré qu’on obtient la différentielle d’une série infinie en prenant
la différentielle de chaque terme? 
(Abel, Janv. 16, 1826, Oeuvres, vol. 2, p. 258) 


... et de juger de la valeur du reste de la série. Ce problème, l’un des plus
importants de la théorie des séries, n’a pas encore été résolu ...
(Lagrange 1797, Oeuvres, vol. 9, p. 42-43, 71) 


... la formule de TAYLOR, cette formule ne pouvant plus être admise comme
générale ... (Cauchy 1823, Résumé, p. 1) 


... mais celui qui me fait le plus de plaisir c’est un mémoire ... sur la simple
série

1 + mx + m(m − 1)/2 · x² + ...

J’ose dire que c’est la première démonstration rigoureuse de la formule binôme
(Abel, letter to Holmboe 1826, Oeuvres, vol. 2, p. 261)


Bis auf die neueste Zeit hat man allgemein angenommen, dass eine ...
continuirliche Function ... auch stets eine erste Ableitung habe, deren Werth nur
an einzelnen Stellen unbestimmt oder unendlich gross werden könne. Selbst
in den Schriften von Gauss, Cauchy, Dirichlet findet sich meines Wissens
keine Äusserung, aus der unzweifelhaft hervor ginge, dass diese Mathematiker,
welche in ihrer Wissenschaft die strengste Kritik überall zu üben gewohnt
waren, anderer Ansicht gewesen seien. (Weierstrass 1872)

Il y a cent ans, une pareille fonction eût été regardée comme un outrage au sens
commun.
(Poincaré 1899, L’oeuvre math. de Weierstrass, Acta Math., vol. 22, p. 5)




Telle est la proposition fondamentale qui a été établie par Weierstrass. 
(Borel 1905, p. 50) 


Es mag auffallend erscheinen, dass diese so einfache Idee, welche im Grunde
genommen in weiter nichts besteht, als dass eine Vielfachsumme verschiedener
Grössen (als welche hiernach die extensive Grösse erscheint) als selbstständige
Grösse behandelt wird, in der That zu einer neuen Wissenschaft sich entfalten
soll; ... (Grassmann 1862, Ausdehnungslehre, p. 5)


... il est très utile d’introduire la considération des nombres complexes, ou
nombres formés avec plusieurs unités, ...
(Peano 1888a, Math. Ann., vol. 32, p. 450) 


Unter einer “Menge” verstehen wir jede Zusammenfassung M von bestimmten
wohlunterschiedenen Objekten m unserer Anschauung oder unseres Denkens 
(welche die “Elemente” von M genannt werden) zu einem Ganzen. 

(G. Cantor 1895, Werke, p. 282) 


Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben
können. (Hilbert, Math. Ann., vol. 95, p. 170)


Nous avons déjà signalé et nous reconnaîtrons dans tout le cours de ce Livre
l’importance des ensembles compacts. Tous ceux qui ont eu à s’occuper
d’Analyse générale ont vu qu’il était impossible de s’en passer.
(Fréchet 1928, Espaces abstraits, p. 66)


... ist die Schwierigkeit, welche nach dem Urtheile aller Mathematiker ... das
Studium jenes Werkes wegen seiner ... mehr philosophischen als
mathematischen Form dem Leser bereitet ... Jene Schwierigkeit nun zu beheben, war
daher eine wesentliche Aufgabe für mich, wenn ich wollte, dass das Buch nicht
nur von mir, sondern auch von anderen gelesen und verstanden werde.
(Grassmann 1862, “Professor am Gymnasium zu Stettin”)


Eine stetige Kurve kann Flächenstücke enthalten: das ist eine der
merkwürdigsten Tatsachen der Mengenlehre, deren Entdeckung wir G. Peano verdanken.
(Hausdorff 1914, p. 369) 


Wir Deutsche gebrauchen statt dessen nach Jacobi’s Vorgange für partielle
Ableitungen das runde ∂. (Weierstrass 1874)


... da Weierstraß’ unmittelbarer Unterricht die Spontanität der Hörer zu sehr
unterdrückte und in der Tat nur für den voll verständlich war, der schon
anderweitig mit dem Stoff sich vertraut gemacht hatte. Die größeren Werke sind
von Ausländern geschrieben ... Wohl das erste stammt von meinem Freunde
Stolz (Innsbruck): “Vorlesungen über allgemeine Arithmetik” ...

(F. Klein 1926, Entwicklung der Math., p. 291)


Or il est facile de voir que les différentielles de cette espèce conservent les
mêmes valeurs quand on intervertit l’ordre suivant lequel les différentiations
relatives aux diverses variables doivent être effectuées.

(Cauchy 1823, Résumé, p. 76) 


On sait que l’évaluation ou même la réduction des intégrales multiples présente
généralement de très grandes difficultés ...

(Dirichlet 1839, Werke, vol. 1, p. 377)


Besonderen Stolz legte Dirichlet auf seine Methode des diskontinuierlichen
Faktors zur Bestimmung vielfacher Integrale. Er pflegte zu sagen, es ist das
ein sehr einfacher Gedanke, und schmunzelnd hinzuzufügen, aber man muss
ihn haben. (H. Minkowski, Jahrber. DMV, 14 (1905), p. 161)


... locum habet pro quacunque alia formula ∫∫ Z dx dy, quippe quae per easdem
substitutiones transformatur in hac ∫∫ Z(VR − ST) dt du ...
(Euler 1769b)